2015-12-22

Testing a Perl Script

(Up-to-date source of this post.)

At $work, I was to upgrade several Debian machines from Squeeze through Wheezy to Jessie (6 to 8). I wanted to be sure that (mostly) the same processes would be running after the upgrade as before. I whipped up a script that simply stores the list of running processes before the upgrade. When run again afterwards, it reports missing processes (if any).

To make the script reliable and easy to maintain I wanted to test it somehow. To do that I turned the script into a modulino following brian d foy's advice in chapter 17 of Mastering Perl. The trick was to put all the code into subroutines that can be tested and to use the caller() function to decide whether the file is being run as a script or loaded as a module. The script looks something like this now:

#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;
use autodie;
use Getopt::Long;
use Pod::Usage;
use Storable qw(freeze thaw);

GetOptions(
    "h|?|help"  => \my $help,
    "l|print"   => \my $print,
    "v|verbose" => \my $verbose,
    "n|net"     => \my $net,
) or pod2usage(1);
pod2usage( -exitval => 0, -verbose => 2, -noperldoc => 1 ) if $help;

run() unless caller();

sub run {
    # code
}

sub missing_procs {
    # code
}

sub get_procs {
    # code
}

After this modification I created a symlink

ln -s checkprocs checkprocs.pm 

and wrote a couple of tests in checkprocs.t

use strict;
use warnings;
use Test::More tests => 3;

use_ok('checkprocs');

#<<<
my $old = [(
    'proc1',
    '/path/to/proc2',
    'proc3',
    'proc4 --with-arg',
    '/path/to/proc5 -w',
)];
my $new = [(
    'proc1',
    'proc3',
    '/path/to/proc5'
)];
#>>>

{
    my @missing_procs = main::missing_procs( $old, $new );
    is(
        "@missing_procs",
        '/path/to/proc2 proc4',
        'Found missing process w/o args'
    );
}

{
    my @missing_procs = main::missing_procs( $old, $new, { verbose => 1 } );
    is(
        "@missing_procs",
        '/path/to/proc2 proc4 --with-arg',
        'Found missing process w/ args'
    );
}

Since I need to run the script under different Perl versions (Squeeze had 5.10.1, Wheezy 5.14.2 and Jessie 5.20.2) I used perlbrew to test it:

$ perlbrew exec prove checkprocs.t
perl-5.10.1
==========
checkprocs.t .. ok
All tests successful.
Files=1, Tests=3,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.04 cusr  0.00 csys =  0.05 CPU)
Result: PASS

perl-5.14.2
==========
checkprocs.t .. ok
All tests successful.
Files=1, Tests=3,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.04 cusr  0.00 csys =  0.05 CPU)
Result: PASS

perl-5.20.2
==========
checkprocs.t .. ok
All tests successful.
Files=1, Tests=3,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.03 cusr  0.00 csys =  0.04 CPU)
Result: PASS

2015-12-08

canssh - can I ssh into the following hosts?

Update: the script has been merged to the sysadmin-util repository.

Do you have a list of hosts and you want to execute a command like

cat /etc/debian_version

on all of them? Wouldn't it be good to know whether you can ssh to all of them before writing the shell loop like

for h in $(cat hosts); do echo -n "$h "; ssh $h "cat /etc/debian_version"; done

In that case this bash script could be of use to you.
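The general idea can be sketched in a few lines of shell. This is not the script from sysadmin-util, only a minimal illustration under stated assumptions: canssh is a hypothetical function name, SSH_CMD is a hypothetical override hook (handy for dry runs), and a host counts as reachable if a non-interactive login can run the true command within 5 seconds.

```shell
# Sketch only: report which hosts accept a non-interactive SSH login.
# SSH_CMD is a hypothetical hook; by default the real ssh client is used.
SSH_CMD=${SSH_CMD:-ssh}

canssh() {
    # Read one host per line from stdin; print "<host> ok" or "<host> FAILED".
    while read -r host; do
        [ -n "$host" ] || continue
        # BatchMode=yes: fail instead of prompting for a password.
        # ConnectTimeout=5: give up on unreachable hosts after 5 seconds.
        # </dev/null: keep ssh from consuming the rest of the host list.
        if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
            </dev/null >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}
```

It would be used as canssh < hosts, with one host name per line in the hosts file.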

2015-12-03

Linux Performance Analysis

(Up-to-date source of this post.)

Taking stock of hardware

Sources of hardware information:

lscpu
/proc/cpuinfo       # one entry for each core seen by the OS
free -m
/proc/meminfo
fdisk -l
/proc/diskstats
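
Since /proc/cpuinfo has one entry per core seen by the OS, the entries can simply be counted. A minimal sketch, assuming the usual Linux layout where each entry starts with a "processor" line; count_cpus is a hypothetical helper name:

```shell
# Count the CPU entries in /proc/cpuinfo-style text read from stdin.
count_cpus() {
    grep -c '^processor[[:space:]]*:'
}
```

Typical use would be count_cpus < /proc/cpuinfo.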

Desktop Management Interface (DMI, aka SMBIOS):

dmidecode -t <type>    # see "DMI TYPES" in manpage

Network:

ifconfig -a

CPU

Overall utilization

Is CPU the bottleneck?

$ vmstat 5 5 --unit M
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0    230    687  44366    0    0  2923  3037    1    0  4  3 85  7
 0  0      0    218    687  44380    0    0 76160    10 2814 4233  3  1 96  0
 0  0      0    224    687  44377    0    0 79462     0 3253 5979  3  2 95  0
 0  0      0    230    687  44374    0    0 82432    18 3069 5674  3  1 95  0
 1  0      0    233    687  44372    0    0 86400    18 3705 5215  3  2 95  0
  • the first line reports averages since the system's boot (the entire uptime); subsequent lines are averages over the previous sample period (5 seconds here)
  • r - runnable processes
  • b - processes blocked for I/O
  • in - interrupts
  • cs - context switches (the number of times the kernel changes which process is running)
  • us - user time (the percentage of time the CPU is spending on user tasks)
  • sy - system (kernel) time
  • id - idle time
  • wa - waiting for I/O

On multiprocessor machines, most tools present an average of processor statistics across all processors.

High us numbers generally indicate computation; high sy numbers mean that processes are making a lot of system calls or doing a lot of I/O. A rule of thumb for a general-purpose server is that the system should spend about 50% of its non-idle time in user space and 50% in system space; the overall idle percentage should be nonzero.

Extremely high cs or in values typically indicate a misbehaving or misconfigured hardware device.
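
To judge the CPU columns over several samples rather than by eye, the output can be averaged. The following is a minimal sketch, not part of the original notes: vmstat_cpu_avg is a hypothetical helper that reads vmstat output on stdin, locates the us, sy, id and wa columns from the header row, and skips the first data row because it holds the since-boot averages.

```shell
# Average the us/sy/id/wa columns of vmstat output read from stdin.
vmstat_cpu_avg() {
    awk '
        # Header row (starts with "r"): remember the position of each column.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data rows start with a number; the first one is the since-boot line.
        $1 ~ /^[0-9]+$/ && ("us" in col) {
            if (++rows == 1) next
            us += $(col["us"]); sy += $(col["sy"])
            id += $(col["id"]); wa += $(col["wa"])
            n++
        }
        END {
            if (n) printf "us=%.2f sy=%.2f id=%.2f wa=%.2f\n", us/n, sy/n, id/n, wa/n
        }
    '
}
```

Usage would be vmstat 5 5 | vmstat_cpu_avg. Fed the sample output shown earlier, it prints us=3.00 sy=1.50 id=95.25 wa=0.00, i.e. the CPU is mostly idle and not the bottleneck.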

Load average

How many pieces is the CPU divided into?

Average number of runnable (ready to run) processes:

$ uptime 
 13:03:23 up 8 days, 13:06,  2 users,  load average: 1.13, 1.31, 1.38
  • 1, 5, and 15-minute averages
  • processes waiting for input (e.g. from keyboard, network) are not considered ready to run (only processes that are actually doing something contribute to load average)
  • on a single-processor system -- 3 usually means busy, > 8 means problem (you should start to look for ways to spread the load artificially, such as by using nice to set process priorities)
  • on a multi-core system -- if number of cores = load average, all cores have just enough to do all the time

The system load average is an excellent metric to track as part of a system baseline. If you know your system’s load average on a normal day and it is in that same range on a bad day, this is a hint that you should look elsewhere (such as the network) for performance problems. A load average above the expected norm suggests that you should look at the processes running on the system itself.
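
Because the meaning of a load average depends on the number of cores, it can help to track load per core in the baseline. A minimal sketch; load_per_core is a hypothetical helper name, and a result of about 1.00 corresponds to the "all cores have just enough to do" case described above.

```shell
# Print the load average divided by the number of cores.
# $1 = load average (e.g. 1.13), $2 = number of cores (e.g. 16)
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (cores <= 0) exit 1          # refuse to divide by zero
        printf "%.2f\n", load / cores
    }'
}
```

On Linux it could be fed live values with load_per_core "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)", which uses the 1-minute average.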

Per process consumption

Which processes are hogging resources?

Snapshot of current processes:

$ ps aux
  • m - show threads

Processes and other system information regularly updated:

$ top
  • z, x - turn on colors and highlight sort column
  • Spacebar - update display immediately
  • M - sort by current resident memory usage
  • T - sort by total (cumulative) CPU usage
  • H - toggle threads/processes display
  • u - display only one user's processes
  • f - select statistics to display

On a busy system, at least 70% of the CPU is often consumed by just one or two processes. Deferring the execution of the CPU hogs or reducing their priority makes the CPU more available to other processes.

How much CPU time a process uses:

$ time ls    # or /usr/bin/time
  • user time - time the CPU spent running the program's own code
  • system time - time the kernel spends doing the process's work (ex. reading files or directories)
  • real/elapsed time - total time it took to run the process, including the time the CPU spent running other tasks

Threads

Some processes can be divided into pieces called threads:

  • very similar to processes: have TID, are scheduled and run by the kernel
  • processes don't share system resources
  • all threads inside a single process share system resources (I/O connections, memory)

Many processes have only one thread - single-threaded processes (usually called just processes).

All processes start out single-threaded. This starting thread is called the main thread. The main thread then starts new threads in much the same way as a process calls fork() to start a new process.

Threads are useful when a process has a lot to do: they can run simultaneously on multiple processors, they start faster than processes, and they communicate with each other more efficiently (via shared memory) than processes do (via a network connection or pipe).

It's usually not a good idea to interact with individual threads as you would with processes.

Memory

See also posts/linux-ate-my-memory.

Amount of paging (swap) space that's currently used:

# swapon -s
Filename                Type        Size    Used    Priority
/dev/sdb2               partition   7815616 0       -1
  • in kilobytes

vmstat (see above) fields:

  • si - swapped in (from the disk)
  • so - swapped out (to the disk) => if your system has a constant stream of page-outs, buy more memory
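
The Size and Used columns of swapon -s can be turned into a single percentage for monitoring. A minimal sketch; swap_used_pct is a hypothetical helper that reads the swapon -s output on stdin and sums over all swap areas.

```shell
# Print the percentage of swap space in use, from "swapon -s" output on stdin.
swap_used_pct() {
    awk '
        NR > 1 { size += $3; used += $4 }   # skip the header line
        END {
            if (size > 0) printf "%.1f\n", 100 * used / size
            else print "no swap"
        }
    '
}
```

Usage would be swapon -s | swap_used_pct. For the output shown earlier (nothing used) it prints 0.0.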

Storage I/O

$ iostat 5 5
Linux 3.2.0-4-amd64 (backup2)   06/14/2015  _x86_64_    (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.80    0.34    3.17    7.49    0.00   85.20

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdb              49.61      1852.45       349.64 1369392967  258461851
sdc             301.74     21510.91     24545.93 15901546498 18145130448
sdd              75.02      6184.17      6195.25 4571531985 4579724644
sda             307.37     16906.94     17127.65 12498149921 12661307662
dm-0            131.14      8082.58      9533.25 5974897325 7047285056
dm-1            172.96     13428.25     15012.67 9926593437 11097845392
dm-2            107.96      1612.16       347.05 1191762057  256547336
  • the first report provides statistics since the system was booted, subsequent reports cover the time since the previous report
  • tps - total I/O transfers per second
  • kB_read/s - average number of kilobytes read per second
  • kB_read - total kilobytes read
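
To pick out the busiest device from a report, the tps column can be compared across rows. A minimal sketch; busiest_device is a hypothetical helper that reads iostat output on stdin and looks only at the last device table, so with iostat 5 5 it reflects the most recent interval rather than the since-boot figures.

```shell
# Print the device with the highest tps in the last report of iostat output.
busiest_device() {
    awk '
        # A "Device" header starts a new table; forget the previous report.
        $1 == "Device:" || $1 == "Device" { in_table = 1; max = -1; dev = ""; next }
        # A blank line ends the table.
        NF == 0 { in_table = 0 }
        in_table && $2 + 0 > max { max = $2 + 0; dev = $1 }
        END { if (dev != "") printf "%s %.2f\n", dev, max }
    '
}
```

Usage would be iostat 5 5 | busiest_device. Applied to the single report shown earlier it prints sda 307.37.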

Processes using file or directory on /usr filesystem (mount point):

$ fuser -cv /usr
                     USER        PID ACCESS COMMAND
/usr:                root     kernel mount /
                     root          1 .rce. init
                     root          2 .rc.. kthreadd

ACCESS:
  • f, o - the process has a file open for reading or writing
  • c - the process's current directory is on the filesystem
  • e, t - the process is currently executing a file
  • r - the process's root directory (set with chroot) is on the filesystem
  • m, s - the process has mapped a file or shared library

List open files:

$ lsof    # pipe output to pager or use options

Network I/O

To see info on network connections:

# netstat -tulanp
  • -t - print TCP ports info
  • -u - print UDP ports info
  • -l - print listening ports
  • -a - print all active ports
  • -n - don't reverse-resolve IP addresses
  • -p - print name and PID of the program owning the socket
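
The netstat output is easy to post-process when only the listening TCP ports are of interest. A minimal sketch; listening_ports is a hypothetical helper, and it assumes the usual Linux net-tools column layout (local address in column 4, state in column 6). UDP sockets have no state column and are therefore skipped.

```shell
# Print "<port>/<proto>" for every TCP socket in LISTEN state, sorted by port.
# Reads "netstat -tulanp"-style output on stdin.
listening_ports() {
    awk '$6 == "LISTEN" {
        n = split($4, part, ":")    # port is the last colon-separated piece
        print part[n] "/" $1
    }' | sort -n
}
```

Usage would be netstat -tulanp | listening_ports.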

To list all programs using or listening to ports (when run as regular user, only shows user's processes):

# lsof -ni -P
  • -n - don't reverse-resolve IP addresses
  • -P - disable /etc/services port name lookups

To list Unix domain sockets (not to be confused with network sockets although similar) currently in use on your system:

# lsof -U    # unnamed sockets have "socket" in NAME column

lsof network connections filtering

by protocol, host and port:

lsof -i[<protocol>@<host>]:<port>

ex.:

lsof -i:22
lsof -iTCP:80

by connection status:

lsof -iTCP -sTCP:LISTEN

Resources

  • ULSAH, 4th, Ch. 29
  • How Linux Works, 2nd, Ch. 8
  • Corresponding man pages

2015-11-27

Big O Notation for Sysadmins

(Up-to-date source of this post.)

  • a mathematical way of describing scaling
  • used to classify a system based on how it responds to changes in input size
  • O is used because the growth rate of an algorithm's run-time is known as its order

Sub-linear scaling

  • O(1) - constant - no matter the scale of the input, performance of the system does not change (ex. hash-table lookup in RAM; such algorithms are rare)
  • O(log n) - logarithmic; ex. binary search gets slower as the size of the corpus being searched grows, but less than linearly

Linear scaling

  • O(n) linear - ex. twice as much data requires twice as much processing time

Super-linear scaling

  • O(n^m) - polynomial - as input size grows the system slows down disproportionately (truly exponential growth is O(m^n), with the input size in the exponent)
  • O(n^2) - quadratic (but everybody says exponential when they mean quadratic)
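
To get a feel for how differently these classes grow, it helps to count worst-case steps for a few input sizes. A minimal sketch with hypothetical helper names: binary_steps counts how many times n can be halved (the O(log n) case, as in binary search), and quadratic_steps returns n times n (the O(n^2) case, as in comparing every item with every other item).

```shell
# Worst-case number of halvings needed to narrow n items down to none.
binary_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 0 ]; do
        n=$((n / 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Number of steps when every item is paired with every item.
quadratic_steps() {
    echo $(( $1 * $1 ))
}

# Print the three growth rates side by side.
for size in 10 1000 1000000; do
    echo "n=$size  O(log n)=$(binary_steps "$size")  O(n)=$size  O(n^2)=$(quadratic_steps "$size")"
done
```

Going from a thousand to a million items only doubles the logarithmic count (10 to 20 steps), while the quadratic count grows a million-fold.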

Resources

  • Practice of Cloud System Administration, Appendix C

2015-07-13

My First CPAN Contribution

Last week I uploaded App::Monport to CPAN. I did it to try out the CPAN toolchain and to get a feeling of accomplishment :-). But I also think the application might be useful for some folks.

Update: My application was mentioned in PerlTricks' article. Cool!

2015-06-29

Bash Login Scripts

(Up-to-date source of this post.)

When bash is started it runs a series of scripts to prepare the environment for the user. These scripts, for example, set environment variables, create command aliases and run programs.

               Login shell                                 Non-login shell
Global config  /etc/profile, /etc/profile.d/               /etc/bash.bashrc, /etc/bash/bashrc, /etc/bashrc
User config    ~/.bash_profile, ~/.bash_login, ~/.profile  ~/.bashrc
  • login shell -- a shell started by the login program or a remote login server such as SSH; place for variables like PATH, PS1 and startup programs like umask
  • non-login shell -- not started by the login program, run on every instance (ex. shell inside an X-based terminal); place for aliases and functions

Creating a symlink between ~/.bashrc and ~/.bash_profile will ensure that the same startup scripts run for both login and non-login sessions. Debian's ~/.profile sources ~/.bashrc, which has the same effect.
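
An alternative to the symlink is to have the login script source ~/.bashrc explicitly, which is roughly what Debian's ~/.profile does. A minimal sketch; source_if_readable is a hypothetical helper name, and the call in the comment shows how it would be used from ~/.bash_profile.

```shell
# Source a file into the current shell, but only if it exists and is readable.
source_if_readable() {
    if [ -r "$1" ]; then
        . "$1"
    fi
}

# In ~/.bash_profile one would then write:
#   source_if_readable "$HOME/.bashrc"
```

Keeping the check explicit means a missing ~/.bashrc does not produce an error at login.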

More

2015-05-20

OsmocomBB

(Up-to-date source of this post.)

OsmocomBB (Open source mobile communications BaseBand) is a GSM baseband software implementation. It intends to completely replace the need for proprietary GSM baseband software. By using OsmocomBB on a compatible phone, you are able to make and receive phone calls, send and receive SMS, etc. based on Free Software. You can learn about, hack and audit mobile networks with this tool.

Here are my notes on how I got OsmocomBB running on a Motorola C118 (brought to me by Mate :-).

Compile

  1. get started
  2. cd ~/osmocom-bb/src/target/firmware/
  3. uncomment CFLAGS += -DCONFIG_TX_ENABLE in Makefile
  4. read this and this

Run

load layer1 code into mobile phone RAM

  1. cd ~/osmocom-bb/src/host/osmocon
  2. sudo -E ./osmocon -p /dev/ttyUSB0 -m c123xor ../../target/firmware/board/compal_e88/layer1.compalram.bin
  3. shortly press On/Off button

run mobile - application implementing a regular GSM mobile phone (and more)

  1. cd ~/osmocom-bb/src/host/layer23/src/mobile
  2. sudo -E ./mobile -i 127.0.0.1

start terminal connection to mobile

  1. cd ~/osmocom-bb/src/host/osmocon
  2. telnet localhost 4247
    • enable
    • sim pin
    • show ms 1 <PIN>
    • show subscriber

Wireshark

To install and run follow this. Quick how-to run wireshark:

nc -u -l 127.0.0.1 4729 > /dev/null &   ## to discard ICMP port unreachable messages
sudo wireshark -k -i lo -f 'port 4729'  ## listen on loopback device, port 4729

System information type 4

  • This message is sent on the BCCH (Broadcast Control Channel) by the network to all mobile stations within the cell giving information of control of the RACH (Random Access Channel), of location area identification (LAI), of cell identity and various other information about the cell.
  • Source: I-ETS 300 022-1 (1998)
  • See also: Signaling Channels

GSM

(Up-to-date source of this post.)

Cellular network

  • a radio network distributed over land areas called cells
  • each cell is served by at least one transceiver - BTS (Base Transceiver Station) = cell site
  • this enables a large number of portable transceivers (e.g. mobile phones) to communicate with each other
  • example of a cellular network: the mobile phone network or PLMN

GSM

  • World's most popular standard for mobile telephony systems (80% of mobile market uses the standard)
  • both signaling and speech channels are digital (1G was analog, ex. NMT)
  • second generation (2G) of mobile phone system
  • GSM release '97 - added packet data capabilities via GPRS
  • GSM release '99 - higher data transmission via EDGE
  • UMTS (Universal Mobile Telecommunications System) - 3G mobile cellular technology for networks based on GSM standards
  • LTE - 4G, standard for wireless communication of high-speed data for mobile phones and data terminals, based on the GSM/EDGE and UMTS/HSPA

Mobile Technology Roadmap

Network Structure

GSM PLMN has two main logical domains:

  1. access network - most used access networks in western Europe as of 2009 (can be deployed in parallel):
    • GERAN (GSM EDGE radio access network)
    • UTRAN (UMTS terrestrial radio access network) - HSPA can be implemented into UMTS to increase data transfer speed
  2. core network
    • circuit switched domain
    • packet switched domain
    • IP multimedia subsystem (IMS)

GPRS/UMTS architecture with the main interfaces:

PLMN

The network is structured into a number of discrete sections:

  • the base station subsystem (BSS) - handles traffic and signaling between a mobile phone and the NSS (access network)
  • the network and switching subsystem (NSS) - part of the network most similar to a fixed network (VOICE, circuit switched)
  • the GPRS core network - optional part for packet based Internet connections (NON-VOICE, packet switched)
  • operations support system (OSS) for maintenance

See this picture for GSM communication.

BSC = Base Station Controller

  • intelligence behind the BTSs (allocation of radio channels, measurements from the mobile phones, handover control from BTS to BTS)
  • concentrator towards the mobile switching center (MSC)
  • the most robust element in the BSS
  • often based on a distributed computer architecture

PCU = Packet Control Unit

  • late addition to the GSM standard
  • processing tasks for packet data

MSC = Mobile Switching Centre

HLR = Home Location Register

  • database of subscribers
  • a central database that contains details of each mobile phone subscriber that is authorized to use the GSM and/or WCDMA core network of this PLMN

VLR = Visitor Location Register

  • register of roaming subscribers

AUC

  • database of authentication keys

EIR

  • stolen devices (phones) register

SS7 = Signaling System #7

  • a set of telephone signaling protocols
  • main purpose: setup/tear down telephone calls
  • other uses: number portability, SMS, etc.

SGSN = Serving GPRS Support Node

  • delivery of data packets from and to mobile stations within its geographical service area
  • packet routing and transfer, mobility management, logical link management, authentication and charging functions

GGSN = Gateway GPRS Support Node

  • main component of the GPRS network
  • inter-networking between the GPRS network and external packet switched networks
  • router to a sub-network

AT commands

Huawei, Android

  • at+cgmi - manufacturer
  • at+cgmm - model
  • at+cimi - IMSI
  • at+cmgw="0914123456",145,"STO UNSENT" - store message to memory
  • at+cmgl="all" - show stored messages
  • at+cmss=3 - send message n. 3 from memory
  • at+cmgd=2 - delete message n. 2 from memory

Links

General

AT commands

Hacking

PDUSpy

Books

  • M. Grayson et al.: IP Design for Mobile Networks (Cisco Press, 2009)
  • A. Henry-Labordere, V. Jonack: SMS and MMS Interworking in Mobile Networks (Artech House, 2004)

2015-01-22

Scan and Compare Versions of Your Network Services

One of the most effective, although often neglected, ways to keep your servers secure is to regularly update the running services (daemons). If you are running more than one server, it might be useful to compare the versions of the services running on the hosts. If you find any differences you should consider upgrading to the newest version.

To achieve this, we are going to use the extremely useful scanning tool called nmap. We will call it from a Perl script and parse its output using the Nmap::Parser module.

Here goes the script:

#!/usr/bin/perl
use strict;
use warnings;
use Nmap::Parser;

die "Usage: $0 host1 [host2 host3 ...]\n" unless @ARGV;

listNetServices(@ARGV);

=head2 listNetServices( @hosts )

Are we running different versions of network services (ex. SSH) on different
hosts? Useful for identifying unpatched (old) versions of network services but
we have to include a host with patched services into @hosts.

=cut

sub listNetServices {
    my @hosts = @_;

    my $services;    # HoH

    # Anonymous subroutine
    my $nmap = sub {
        my $host = shift;    #Nmap::Parser::Host object, just parsed

        for my $port ( $host->tcp_ports('open') ) {

            # Nmap::Parser::Host::Service object
            my $svc = $host->tcp_service($port);

            my $service = join( ' | ',
                $svc->name    // '',
                $svc->product // '',
                $svc->version // '' );

            push @{ $services->{$port}{$service} },
              $host->hostname . ' (' . $host->addr . ')';

        }
    };

    my $np = Nmap::Parser->new;
    $np->callback($nmap);
    $np->parsescan( '/usr/bin/nmap', '-sV', @hosts );

    # Print report
    for my $port ( sort { $a <=> $b } keys %$services ) {

        my $n_versions = keys %{ $services->{$port} };
        next unless $n_versions > 1;

        print "$port - $n_versions different versions on this port\n";

        for my $version ( sort keys %{ $services->{$port} } ) {
            print ' ' x 4 . $version . "\n";
            for my $host ( sort @{ $services->{$port}{$version} } ) {
                print ' ' x 8 . $host . "\n";
            }
        }
    }

    return;
}

(Up-to-date source of the ListNetServices function can be found in MyUtils.)

And this is a sample output suggesting that host1 is running older versions of both ssh and http daemons:

$ perl script-listed-above host1 host2
22 - 2 different versions on this port
    ssh | OpenSSH | 5.5p1 Debian 6+squeeze5
        host1 (1.2.3.4)
    ssh | OpenSSH | 6.7
        host2 (5.6.7.8)
80 - 2 different versions on this port
    http | Apache httpd | 2.2.16
        host1 (1.2.3.4)
    http | Apache httpd | 2.4.10
        host2 (5.6.7.8)