Jul 30 2010

Enjoying UNIX

Category: zvolkov @ 19:39

Here's what I've been up to lately: FreeBSD, vim, tmux, command line twitter client (ttytter) and unicode-capable graphic terminal (jfbterm). No X-Windows involved!

Tags:

Jul 16 2010

Insert-or-Update records in bulk with NHibernate batching

Category: zvolkov @ 14:42

I have a table of Prices: ProductID, Date, Price and Source. My application needs to load new prices, but if a price already exists and the new one comes from a "better" source, it should update the existing record.

The original app loaded all data into a SQL temp table and did a bulk insert to the target table, followed by a bulk update. Kinda simple and efficient, even if too SQL-centric. The first version of my app checked every ProductID+Date combination for existence using a regular stateful NHibernate session, then loaded new prices using SQL Bulk load, and updated each existing price using a regular NHibernate session. Needless to say, not only was it slow, but the gap between the read and the insert allowed new prices from other sources (my app is not the only source) to sneak into the table, causing duplicate/ambiguous records to appear. I needed a solution that would let me Insert-or-Update records fast, without resorting to the temp-tables+SQL mess. (I have a habit of trying to minimize the amount of SQL code in my application, but if nothing else worked, I would have to fall back to the old ways.)

Here's the outline of my new solution:

  • Stateless NHibernate session (IStatelessSession) for extra efficiency. No need to spend CPU cycles keeping that first-level cache growing.
  • Single explicit transaction wrapping entire load operation for all records. This reduces stress on SQL Server by not having it commit every record. If my volume becomes too high I can always limit the transaction size to say 10000 rows.
  • Custom sql-insert script defined in the entity's mapping file that implements the read-update-or-insert logic. The idea is to be able to completely process one record without going back and forth between SQL Server and .NET.
  • NHibernate batching enabled, in order to push the records hundreds at a time to the SQL Server.

Combined, these techniques are designed to make the whole operation less chatty and, in a way, achieve the effect of having the processing done on the SQL Server side.

Here's what I discovered while implementing the above:

  • IStatelessSession does not support the .BeginTransaction(IsolationLevel) overload. Instead, do session.Transaction.Begin(IsolationLevel.Serializable), which does exactly the same thing (I looked at the NH source code).
  • You can further optimize the logic by not doing session.Get for each many-to-one property of your entity. In my example, instead of doing price.Product = session.Get<Product>(ProductID) I can create a dummy Product object once and simply set its Id to a different value for each price. The stateless session does not care that the object is transient.
  • The custom sql-insert script is sensitive to the order of parameters (they're simply represented by "?"s). Turn on NH debug logging and look at the NH log to see the exact order in which the object's properties are mapped to SQL params. Do this before you implement sql-insert.
  • The custom sql-insert script is sent to the server once for every inserted record. You may want to move it to a stored procedure (SP) to minimize the traffic, and execute the SP from sql-insert. (Note that the prepare-sql setting has no effect on batched statements.)
  • In order for NH batching to work, the entity's Identity must be assigned before the insert. Can't use generator=native (aka SQL Server IDENTITY).
  • Since your Insert is now in fact Insert-Update-Or-Do-Nothing, you want your entity's Identity to be its natural key, in my case ProductID + Date. That requires implementing Equals and GetHashCode, but other than that it's real easy.
  • NH batching requires check="rowcount" setting on the sql-insert. This means your script must return number of records affected, whether it actually inserted, updated or did nothing.
  • NH batching uses global settings for its CommandTimeout, not the one defined in your SessionFactory. This means you must add hibernate-configuration section to your App.Config (or Web.Config).

This summary turned out to be more elaborate than I expected. No details for you then. You can figure it out, I have trust in you.

Jul 11 2010

A joke that made me laugh

Category: zvolkov @ 19:28

So a Java developer, a web developer and a UNIX admin go to a bar to get a drink. The Java developer and the web developer get into an argument over whose job is more important. "Thousands of people see my work," claims the trendy web developer. "My code runs the banks!" retorts the tidy Java developer. They both look to the UNIX admin, expecting him to have the final word. He puts down his drink slowly. He then smashes his glass on the table, jumps on it and rips off his pants. He takes a shit and rubs it in the Java dev's face, and while the web dev makes a run he jumps him, ripping his eyes out and eating them. He then goes back to the cave where he lives.

Jul 10 2010

How to enable Alt (Meta) on FreeBSD

Category: zvolkov @ 16:19

In order to make bash shortcuts like Alt+. work in virtual terminals, you need to modify the keyboard mappings so that the left Alt key acts as the special Unix "Meta" key.

You can either start with one of the existing mappings, like this:

sed 's/lalt/meta/g' /usr/share/syscons/keymaps/us.iso.kbd >  /usr/share/syscons/keymaps/local.kbd

Or, if you're not sure which keymap file you're currently using you can get your current keybindings using kbdcontrol:

kbdcontrol -d | sed 's/lalt/meta/g' > /usr/share/syscons/keymaps/local.kbd

Then, assuming you want this change to be global, edit your /etc/rc.conf file and add the following line:

keymap="local"
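Before overwriting anything, it can help to see what the substitution does to a keymap entry. The following is a minimal sketch: the two sample lines are only an illustrative excerpt in the style of a syscons keymap (an assumption, not copied from your system), and the variable names are made up for the example.

```shell
# Illustrative two-line excerpt in the style of a syscons keymap file;
# a real keymap has one entry per scancode.
sample='  001   esc    esc    esc    esc    esc    esc    debug  esc     O
  056   lalt   lalt   lalt   lalt   lalt   lalt   lalt   lalt    O'

# Same substitution as in the commands above, applied to the sample only.
remapped=$(printf '%s\n' "$sample" | sed 's/lalt/meta/g')
printf '%s\n' "$remapped"

# Count lines that still mention lalt and lines that now mention meta.
# grep -c exits non-zero when the count is 0, hence the "|| true".
lalt_left=$(printf '%s\n' "$remapped" | grep -c 'lalt' || true)
meta_lines=$(printf '%s\n' "$remapped" | grep -c 'meta' || true)
echo "lines with lalt: $lalt_left, lines with meta: $meta_lines"
```

Running the same two grep counts against the generated local.kbd is a cheap sanity check before adding the keymap line to /etc/rc.conf.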

Jul 8 2010

UNIX (and bash) command line tricks

Category: zvolkov @ 08:06

Just a random selection of keyboard shortcuts and other productivity tricks for bash.

Keyboard shortcuts: (use bind -p to see all current mappings)

  • Ctrl+a -- beginning of line
  • Ctrl+e -- end of line
  • Ctrl+u -- cut head
  • Ctrl+k -- cut tail
  • Ctrl+y -- paste
  • Ctrl+_ -- undo (may require pressing Shift, depending on your keyboard mappings)
  • Ctrl+L -- clear screen
  • Alt+b -- skip to prev word boundary (if Alt doesn't work on all terminals, see the post on enabling Alt (Meta) on FreeBSD or try Esc followed by the char instead)
  • Alt+f -- skip to next word boundary
  • Alt+d -- delete next word
  • Alt+backspace -- delete prev word
  • Ctrl+w -- cut word
  • Ctrl+R -- search command history
  • Alt+. -- insert the last argument of the previous command

Add following to .inputrc:

  • "\e[3~": delete-char -- enable normal Delete function (as opposed to both Backspace and Delete working as Backspace)
  • "\C-?": delete-char -- same as above but works on virtual terminal (not just on SSH)
  • "\e[1~": beginning-of-line -- enable Home button
  • "\e[4~": end-of-line -- enable End button
  • "\e[1;5D": backward-word -- enable Ctrl+Left (only works on SSH terms)
  • "\e[1;5C": forward-word -- enable Ctrl+Right (only works on SSH terms)

History tricks:

  • alias h='history 25' -- add to .bashrc (call .bashrc from .bash_profile using the ".")
  • PS1="\!:\W\$" -- add to .bash_profile to set prompt to [History#]:[CurrentDir]$
  • !$ -- substitute last argument of last command
  • !123 -- repeat history number 123
  • export HISTIGNORE=$'[ \t]*:&:[fb]g:exit:ls' -- add to .bashrc, this way exact dups, ls, bg, fg, exit, and lines starting with space won't be added to history
  • PROMPT_COMMAND='history -a' -- add this and next line to .bashrc to make bash save history immediately, this way history will work right with two simultaneous terminals
  • shopt -s histappend

Tab-completion:

  • set completion-ignore-case on -- add to .inputrc to enable case-insensitive completion
  • set show-all-if-ambiguous on -- display all completion options on single Tab rather than on double-Tab
  • case $- in -- install bash-completion and add this to .bashrc to get Tab-completion for command arguments (the path can be different)
       *i*) [[ -f /etc/bash_completion ]] && . /etc/bash_completion ;;
    esac

grep:

  • grep -options 'word' filename -- general syntax
  • -A4 -B4 -- show +- 4 lines of context
  • -v -- NOT match
  • -i -- case insensitive
  • -w -- separate word
  • -n -- show line numbers
  • -r -- recursive
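To see how these options differ, here is a small self-contained sketch. The file name notes.txt, its contents and the variable names are invented for the example; it only uses the options listed above plus -c (count) and -l (list file names).

```shell
# Work in a throwaway directory so nothing real is touched.
grep_demo=$(mktemp -d)
printf '%s\n' 'first line' 'the word is here' 'Word in caps' 'swordfish' 'last line' \
  > "$grep_demo/notes.txt"

# -n: prefix each match with its line number ("swordfish" contains "word").
with_numbers=$(grep -n 'word' "$grep_demo/notes.txt")
# -w: whole-word matches only, so "swordfish" is skipped.
whole_word=$(grep -w 'word' "$grep_demo/notes.txt")
# -i: ignore case, so "Word" matches too; -c prints the count of matching lines.
any_case_count=$(grep -ci 'word' "$grep_demo/notes.txt")
# -v: invert, i.e. lines that do NOT match.
non_matching_count=$(grep -vc 'word' "$grep_demo/notes.txt")
# -A1 -B1: one line of context after and before each match.
with_context=$(grep -A1 -B1 'caps' "$grep_demo/notes.txt")
# -r: recurse into a directory; -l lists only the names of matching files.
matching_files=$(grep -rl 'word' "$grep_demo")

printf '%s\n' "$with_numbers" "$whole_word" "$with_context"
rm -rf "$grep_demo"
```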

Other useful stuff:

  • alias cd='pushd > /dev/null' -- add to .bashrc to map cd to pushd, this way the next alias bd can be used to go back
  • alias bd='popd'
  • find . -type f -exec grep "word" /dev/null {} \; -- find word recursively in all files
  • sed 's/old/new/g' filename -- find and replace
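The /dev/null argument in the find command is easy to overlook. A rough sketch of why it is there, using made-up file names in a temporary directory: with -exec ... \; grep is run once per file, and grep only prints the file name when it is given more than one file to search.

```shell
# Throwaway tree with one matching and one non-matching file.
find_demo=$(mktemp -d)
mkdir -p "$find_demo/sub"
printf '%s\n' 'old value' > "$find_demo/a.txt"
printf '%s\n' 'nothing here' > "$find_demo/sub/b.txt"

# One file per grep invocation: the match is printed without a file name.
without_name=$(cd "$find_demo" && find . -type f -exec grep "old" {} \;)
# Adding /dev/null makes it two files, so grep prefixes the file name.
with_name=$(cd "$find_demo" && find . -type f -exec grep "old" /dev/null {} \;)

# sed prints the edited text to stdout; the file itself is left unchanged.
replaced=$(sed 's/old/new/g' "$find_demo/a.txt")
still_original=$(cat "$find_demo/a.txt")

printf '%s\n' "$without_name" "$with_name" "$replaced" "$still_original"
rm -rf "$find_demo"
```

To keep the result of the sed replacement, redirect its output to a new file and move that over the original.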

Jul 5 2010

Learning UNIX part II

Category: zvolkov @ 21:49

Quick update: the learning is going really well. MINIX turned out to be a piece of crap, even as a learning tool. FreeBSD is much more solid. Unix System Administration Handbook turned out to be a better book than I expected. Less about administration and more about fundamentals. The Unix Time-sharing System has been a godsend. An Introduction to the UNIX Shell was very helpful too. These 3 have my topmost recommendations to anyone starting with *nix.

I'm currently running FreeBSD 8.0 under Windows 7's Virtual PC (the one that comes with XP Mode). The biggest pain was getting DHCP to work. The trick was to force full-duplex on the virtual Ethernet adapter and set the DHCP client to run synchronously at boot time. Here's how my /etc/rc.conf looks now:

font8x8="cp437-8x8"
font8x14="cp437-8x14"
font8x16="cp437-8x16"
allscreens_flags="MODE_32"
synchronous_dhclient="YES"
ifconfig_de0="DHCP media 100baseTX mediaopt full-duplex"

The top 4 lines configure support for a 30-line screen, as opposed to the default 25-line one.
The bottom 2 lines configure DHCP.

Another pain was adding a second virtual hard drive and mounting it as /home -- I finally did it using the "dedicated" method from the FreeBSD Handbook.

Finally, in order to get the VM to feel more responsive I had to add the following setting to my /boot/loader.conf:

kern.hz=50

That's pretty much it; everything else was left at the defaults. This was tons of fun!

Jun 26 2010

Learning UNIX

Category: zvolkov @ 15:14

My master's thesis project was written in PHP + MySQL. Back then I could install Linux, rebuild its kernel and do basic administrative work. All of that is mostly forgotten by now. Lately I've been contemplating Windows culture, how it affects my quality of life as a programmer, and speculating about the virtues of the UNIX universe. Arguably, even such remote offspring as Java and Ruby can be related to UNIX culture. And who of Fortune 100 companies run Microsoft? Google? Facebook? Wikipedia? None!

Thus I decided to refresh my UNIX. I started with a quick review of lineages and distributions. Of hundreds of them, I centered on Debian, FreeBSD, and MINIX. Debian -- because it's the biggest and longest surviving truly non-commercial project. FreeBSD -- because it seems to be the most authentic descendant of the original UNIX. And MINIX -- because it is a minimalistic UNIX-like OS that has a companion book explaining low-level kernel design, complete with source code.

At the moment, I've installed MINIX in a VM and am playing with the shell. I've also identified a few key topics and did some significant googling to find best articles to quickly get myself up to speed (the ones that I both read and loved are highlighted in bold).

Of all the books praised on the Internet, my local library had Unix System Administration Handbook, A Practical Guide to Linux by Mark G. Sobell, and Mac OS X Tiger for Unix Geeks.

Let the fun begin!

 

 

Jun 23 2010

Testing MSMQ from Powershell

Category: zvolkov @ 16:01

Just a quick note to myself on how to ping a queue using PowerShell (note how verbose the syntax is!)

[Reflection.Assembly]::LoadWithPartialName( "System.Messaging" )
$msmq = [System.Messaging.MessageQueue]
$mq = New-Object $msmq("FormatName:DIRECT=OS:SERVERNAME\Private$\QUEUENAME", $False, $False, [System.Messaging.QueueAccessMode]::Peek);
$mq.Peek([System.TimeSpan]::FromSeconds(1))

Jun 21 2010

Goomgum - my first ever OSS project

Category: zvolkov @ 16:00

Opsource Cloud Restful API is a RESTful web service interface that allows users to control their cloud environment over HTTPS.

Goomgum is a .NET library that can be used to simplify access to Opsource Cloud API from .NET applications.

Goomgum wraps REST-style API operations in an OOP-style class library that can be used as a base for creating your own Opsource Cloud automation tools in .NET. Goomgum currently features a small but growing subset of the parent API, a set of unit-tests that does not require access to the live API, and a sample command line utility that consumes the library. With goomgum you can use LINQ and other .NET niceties to manage swarms of VMs with ease:

using OpSource;
using System.Linq;

var c = new Cloud("login", "password");
Account a = c.Authenticate();
var servers = a.GetServers().Where(s => s.Name.StartsWith("test"));
foreach(var s in servers)
    s.Start();

Feel free to download the binaries and try for yourself.

Goomgum is currently developed in C# for .NET Framework 4.0 Client Profile. The client library itself does not have any third-party dependencies, but the Unit Tests and the Command Line App depend on a few free libraries & tools such as Rhino Mocks, Command Line Parser Library and ILMerge.

If you want to participate I will be glad to give you write access to the repository. The project is currently hosted on Google Code.

Jun 18 2010

Environment-aware Configuration with DNS-based Environment Determination

Category: zvolkov @ 15:36

Inspired by this article, I decided to try DNS-based Environment Determination (DBED).

DBED is a Configuration Management technique by which you can minimize the effort/overhead of maintaining multiple Technical Operation Environments (DEV, TEST, QA, UAT etc.). The problem that DBED intends to solve is the proliferation of configuration changes, which grows as the Cartesian product of number-of-Environments times number-of-Dependencies-Between-Services. So if you have 2 environments with 2 web services and a SQL Server, you'd have to manage a total of 4 configuration entries. This quickly gets out of hand when you have to manage 20 interrelated services across 5-6 environments. The promise of DBED is to deploy the exact same configuration of a service to each of the environment-specific servers and have it know which servers its dependencies live on. Hence the name: environment-aware configuration.

The approach is to create a subdomain for each of the environments (e.g. dev.internal.company.com, qa.internal.company.com etc.) and then in each subdomain create a bunch of aliases to various servers (e.g. mainsql.dev.internal.company.com --> DB123.internal.company.com, mainsql.qa.internal.company.com --> DB456.internal.company.com etc.). These aliases (unqualified by the full domain suffix) are what you then point to in your config files. But how to "route" the request to a different subdomain based on the requesting server's environment? By modifying that server's DNS suffix sequence! Here's how (assuming your DNS is on Windows):

First, get your IT/NOC/Admin department's buy-in and support. You'll need permissions to access your Domain Controller (this is where the DNS Server typically runs) and/or Active Directory (if you take the Group Policy route, more on this later). Alternatively, if you can bring up your own DNS Server and have enough rights to point your non-PROD servers to it, you may be able to stay self-sufficient a bit longer.

Second, connect to the DNS Server and create a new Primary Zone under one of the subdomains you control. For example, in my case we had an internal domain called Tech.Corp.Local, under which I created Cloud1.Tech.Corp.Local, Cloud2.Tech.Corp.Local etc. You should also create "Delegation" records in the parent domain (Tech.Corp.Local) to point to the same server (unless your CloudX zones will be on a different DNS Server). Simple scenarios (e.g. a client PC directly asking for name resolution) will work even without Delegation records, but more advanced scenarios like recursive queries made by other DNS servers may break. Then again, if you are in a complex DNS environment you'll probably need a specialist's help anyway. Alternatively, you can create the environment subdomains directly in the existing zone (Tech.Corp.Local) as opposed to creating a bunch of new zones. If you do that, you don't need to create delegation records. It actually does not make a hell of a lot of difference whether you do subdomains or delegations. With delegations you'll have the option of easily moving the separate zones to another DNS server, whereas with subdomains you'll have to recreate them by hand.

Third is to create CNAME (Canonical Name) records in your environment zones (or domains) to map your aliases to the actual names of the servers. The reason we're creating CNAME and not A records is to avoid creating a hard link to the servers' IP addresses. Instead, CNAME simply redirects DNS lookup from the alias to an existing name. This way your aliases will keep working even if the servers' IP addresses change. Once you're done you should have something similar to the picture above. Repeat this for each environment.

Finally, you will need to modify the DNS suffix sequence on each of the servers in each environment. In the Network settings, under TCP/IP 4, Advanced, DNS tab, there's an option for searching various domains. By default it's set to use only the primary connection suffix (domain), but you can switch it to use a list you set (make sure you include all the domains you want it to search). When resolving "maindb" without any domain, it'll iterate through the list until it finds a resolution. Of course, this doesn't work if you specify an FQDN in your config file. In my cloud example, my suffix list needs to have Cloud1.Tech.Corp.Local at the top, followed by Tech.Corp.Local (followed by my cloud provider's suffix). With this setup an unqualified name will first try to resolve in Cloud1, and then in Tech. This allows our aliases to map to the corresponding environments, while still keeping all other names functional. If RDPing into each server and modifying its DNS settings feels like a pain (which it is!) you may try to use Active Directory's Group Policies to push the suffix list to the corresponding servers. I was told it's possible, but as I have not tried it myself I will "leave it as an exercise" for you. UPDATE: As explained here, the DNS Suffix Search List can be modified remotely from the command line by doing wmic /node:[machine] nicconfig call SetDNSSuffixSearchOrder (Cloud1.Tech.Corp.Local,Tech.Corp.Local) and you can also use wmic /node:[machine] nicconfig get DNSDomainSuffixSearchOrder to verify it.

Once all of the above is done, restarting the servers should not be required. Just try pinging both your new aliases and the regular names to make sure everything works. Now point your configs to the aliases and enjoy environment-aware configuration!
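The suffix search itself is simple enough to model in a few lines of shell. What follows is a toy simulation, not how the Windows resolver is actually implemented: the lookup function stands in for the DNS server, the mainsql records reuse the example names from earlier in this post, and the fileserver/FS001 record is a hypothetical name added to show the fall-through to the parent domain.

```shell
# Stand-in for the DNS server: maps a fully qualified alias to its target.
# fileserver/FS001 is a hypothetical record that exists only in the parent domain.
lookup() {
  case "$1" in
    mainsql.dev.internal.company.com) echo "DB123.internal.company.com" ;;
    mainsql.qa.internal.company.com)  echo "DB456.internal.company.com" ;;
    fileserver.internal.company.com)  echo "FS001.internal.company.com" ;;
    *) return 1 ;;
  esac
}

# resolve NAME SUFFIX...: try NAME under each suffix in order, print the first hit.
resolve() {
  name=$1
  shift
  for suffix in "$@"; do
    if target=$(lookup "$name.$suffix"); then
      echo "$name.$suffix -> $target"
      return 0
    fi
  done
  echo "$name: not found" >&2
  return 1
}

# Same unqualified name, different answer depending on the server's suffix list.
resolve mainsql dev.internal.company.com internal.company.com
resolve mainsql qa.internal.company.com internal.company.com
# A name with no environment-specific alias falls through to the parent domain.
resolve fileserver dev.internal.company.com internal.company.com
```

The config file only ever says "mainsql"; the order of the suffix list on each server decides which environment's record answers.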

 
