AfNOG 2010 Network Management Tutorial

Smokeping

Notes:
------
* Commands preceded with "$" imply that you should execute the command
  as a general user - not as root.
* Commands preceded with "#" imply that you should be working as root.
* Commands with more specific command lines (e.g. "RTR-GW>" or "mysql>")
  imply that you are executing commands on remote equipment, or within
  another program.

Information
-----------

pcN means replace the "N" with the proper number for your PC.

Your pcs are numbered like this:

    row 1 = pc1-10
    row 2 = pc33-34
    row 3 = pc65-72
    row 4 = pc97-106
    row 5 = pc129-137

This information is available in the class network diagram:

    http://noc/network-diagram.html

Exercises
----------

1. Install Smokeping

    $ sudo apt-get install smokeping

2. Initial Configuration

    $ cd /etc/smokeping

No changes are necessary in this page. You can update the Smokeping look
and feel by editing /etc/smokeping/basepage.html

    $ cd /etc/smokeping/config.d
    $ ls -l

    -rwxr-xr-x 1 root root  578 2010-02-26 01:55 Alerts
    -rwxr-xr-x 1 root root  237 2010-02-26 01:55 Database
    -rwxr-xr-x 1 root root  413 2010-02-26 05:40 General
    -rwxr-xr-x 1 root root  271 2010-02-26 01:55 pathnames
    -rwxr-xr-x 1 root root  859 2010-02-26 01:55 Presentation
    -rwxr-xr-x 1 root root  116 2010-02-26 01:55 Probes
    -rwxr-xr-x 1 root root  155 2010-02-26 01:55 Slaves
    -rwxr-xr-x 1 root root 8990 2010-02-26 06:30 Targets

The files you need to touch (at a minimum) are:

    * Alerts
    * General
    * Probes
    * Targets

Edit Alerts

    $ sudo vi Alerts

Update the top of the file where it says:

    *** Alerts ***
    to = alerte@address.somewhere
    from = smokealert@company.xy

to include a proper "to" and "from" field for your server. Something
like:

    *** Alerts ***
    to = afnog@localhost
    from = smokeping-alert@pcN.mgmt.ws.afnog.org

If you were going to create tickets from Smokeping alerts the "to"
address would be an alias for the ticketing system. We will do this a
bit later.
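
If you would rather script the "to" and "from" change than make it in
vi, the following is a minimal sketch and not part of the original
exercise. It assumes GNU sed (what apt-get based systems normally
provide) and that the lines begin exactly with "to = " and "from = " as
shown above. The function name set_alert_addresses is hypothetical, and
the demonstration works on a scratch copy, not the real Alerts file.

```shell
#!/bin/sh
# Build a scratch copy that mimics the top of the stock Alerts file,
# so the real /etc/smokeping/config.d/Alerts is never touched here.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
*** Alerts ***
to = alerte@address.somewhere
from = smokealert@company.xy
EOF

# Hypothetical helper: rewrite the "to" and "from" lines in place.
#   $1 = file to edit
#   $2 = new "to" address
#   $3 = new "from" address
# Assumes GNU sed (for -i) and addresses that do not contain "|".
set_alert_addresses() {
    sed -i \
        -e "s|^to = .*|to = $2|" \
        -e "s|^from = .*|from = $3|" \
        "$1"
}

# pc1 is used only as an example; substitute your own pcN.
set_alert_addresses "$tmp" afnog@localhost smokeping-alert@pc1.mgmt.ws.afnog.org
cat "$tmp"
```

To apply the same idea to the real file you would run the function with
sudo rights against /etc/smokeping/config.d/Alerts, ideally after making
a backup copy.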

Add a new alert for later use:

    +rttbadstart
    type = rtt
    # in milliseconds
    pattern = ==S,==U
    priority = 1
    comment = offline at startup

* "==S,==U" means "at Startup" and "not Up".
* "priority = 1" means that if multiple alerts are defined for a host
  and more than one of them matches, only the one with the highest
  priority is executed.

Now save the file and exit, then edit the file General:

    $ sudo vi General

Change the following lines:

    owner
    contact
    cgiurl
    mailhost

Something like this should work:

    owner    = AfNOG 2010 User
    contact  = afnog@pcN.mgmt.ws.afnog.org
    cgiurl   = http://localhost/cgi-bin/smokeping.cgi
    mailhost = localhost

Now save the file and exit, then edit the file Probes:

    $ sudo vi Probes

The current entry in Probes is fine, but if you wish to use additional
Smokeping checks you can add them here and specify their default
behavior. You can also do this in the Targets file if you wish.

Here is an example of a Probes file that specifies what to use to check
HTTP and DNS latency, as well as the FPing probe that is used for ping
latency:

    *** Probes ***

    + FPing

    binary = /usr/bin/fping

    + EchoPingHttp

    + DNS

    binary = /usr/bin/dig
    pings = 5
    step = 180

Go ahead and update your Probes file with this information. Then save
the file and exit.

Now let's restart the Smokeping service to verify that no mistakes have
been made before going any further:

    $ sudo /etc/init.d/smokeping stop
    $ sudo /etc/init.d/smokeping start

You could also do:

    $ sudo /etc/init.d/smokeping restart

or

    $ sudo /etc/init.d/smokeping reload

to reload configuration changes. This should work in most cases.

3. Configure monitoring of devices

The majority of your time and work configuring Smokeping will be done in
the file /etc/smokeping/config.d/Targets.

For this class please do the following:

Use the FPing probe to check:

    - all the student PCs
    - classroom NOC
    - switches
    - routers

Create some hierarchy in the Smokeping menu for your checks.
Such as:

    NOC
    PCs
    Routers
    Switches

Add a check for HTTP latency for all the classroom PCs. This will mean
adding another category, such as:

    PCs HTTP

If you have time, consider checking some machines that are external to
our classroom and the conference.

Look at additional Smokeping probes and consider implementing some of
them:

    http://oss.oetiker.ch/smokeping/probe/index.en.html

Explaining all the syntactical details of the file
/etc/smokeping/config.d/Targets would require several pages, so we will
go through some examples in class. You can also refer to the Smokeping
configuration files that are in use on the classroom NOC box:

    http://noc/configs/etc/smokeping
    http://noc/configs/etc/smokeping/config.d

4. Add DNS Latency Checks

You can check internal names, external names, or both using the DNS
latency probe. Add a menu hierarchy for DNS Latency. Check an external
address (nsrc.org) and an internal address (noc). This will look
something like this (in Targets):

    $ sudo vi /etc/smokeping/config.d/Targets

    ++ DNS
    probe = DNS
    menu = External DNS Check
    title = DNS Latency

    +++ nsrc
    host = nsrc.org

    +++ noc
    host = noc.mgmt.ws.afnog.org

Save your changes to the file Targets and exit. Restart Smokeping to see
the changes:

    $ sudo /etc/init.d/smokeping restart

5. Send Smokeping alerts to our Request Tracker Net queue

We've already set up RT on our NOC machine. You just need to point
Smokeping alerts to our RT instance.

Edit the file Alerts:

    $ sudo vi /etc/smokeping/config.d/Alerts

And change the "to" address you set in exercise 2, for example:

    to = afnog@localhost

to

    to = net@noc.mgmt.ws.afnog.org

Now whenever Smokeping sends an alert, an email with the alert text will
arrive in the Net queue in Request Tracker.

Next, be sure you have alerts defined for some of your Targets. You can
turn on alerts either by defining alerts for a probe in the
/etc/smokeping/config.d/Probes file, or in individual Targets entries.
In our case let's edit the Targets file and turn on alerts for our DNS
Latency checks.
In addition, if you add a DNS latency check for a host that does not
exist, then we can see a ticket being created in RT.

    $ sudo vi /etc/smokeping/config.d/Targets

Find the following section in the file:

    ++ DNS
    probe = DNS
    menu = External DNS Check
    title = DNS Latency

    +++ nsrc
    host = nsrc.org

    +++ noc
    host = noc.mgmt.ws.afnog.org

And, add the following host after "+++ noc"

    +++ noexist
    host = does.not.exist
    alerts = rttbadstart

Save and exit from the file, then restart smokeping:

    $ sudo /etc/init.d/smokeping restart

You will see an error message on the screen:

    WARNING: Hostname 'does.not.exist' does currently not resolve to an IPv6 or IPv4 address

This is to be expected as the host "does.not.exist" is not a valid host.
But, Smokeping still starts, and the rttbadstart Alert will now send
email to the Net queue for Request Tracker.

If you open a web browser to view our classroom RT installation:

    http://noc/rt/

and log in as user "inst" you will see a new ticket in the home screen
that has a subject line something like:

    "[SmokeAlert] rttbadstart is active on pcN.DNSProbe.RT-test"

6. MultiHost Graphs

Once you have defined a group of hosts under a single probe type in your
/etc/smokeping/config.d/Targets file, then you can create a single graph
that will show you the results of all smokeping tests for all hosts that
you define. This has the advantage of letting you quickly compare, for
example, a group of hosts that you are monitoring with the FPing probe.

To create a MultiHost graph first edit the file Targets:

    $ sudo vi Targets

If you had a section for the FPing probe defined that looked like this
(this is an example only - your Targets file may look different):

    + Local

    menu = Local
    title = Local Network

    ++ LocalMachine

    menu = Local Machine
    title = This host
    host = localhost

    ++ pc1

    menu = pc1.mgmt
    title = pc1, row1
    host = pc1.mgmt.ws.afnog.org

    ++ pc2

    menu = pc2.mgmt
    title = pc2, row1
    host = pc2.mgmt.ws.afnog.org

    ++ pc3

    menu = pc3.mgmt
    title = pc3, row1
    host = pc3.mgmt.ws.afnog.org

    etc...

Right now smokeping displays the results of the FPing probe for each
host in a separate graph. If you wish to see the results in a single
graph with multiple lines, then you would add this after the last FPing
probe host definition:

    + MultiHostPCs

    menu = MultiHost Ping
    title = Consolidated Ping Response Time
    host = /Local/LocalMachine /Local/pc1 /Local/pc2 \
           /Local/pc3

Note how the "host" entry can span multiple lines by using the "\"
character to indicate that it continues on the next line.

Now save and exit the file Targets and restart smokeping:

    $ sudo /etc/init.d/smokeping restart

You should see a new graph under the "MultiHost Ping" menu in your
smokeping web interface. This graph will have a different colored line
for each host you have defined.

7. Slave instances

This is a description only for informational purposes in case you wish
to attempt this type of configuration once the workshop is over.

The idea behind this is that you can run multiple smokeping instances at
multiple locations that are monitoring the same hosts and/or services as
your master instance. The slaves will send their results to the master
server and you will see these results side-by-side with your local
results. This allows you to view how users outside your network see your
services and hosts.

This can be a powerful tool for resolving service and host issues that
may be difficult to troubleshoot if you only have local data.

Graphically this looks like this:

    [slave 1]     [slave 2]      [slave 3]
        |             |              |
        +-------+     |     +--------+
                |     |     |
                v     v     v
            +---------------+
            |    master     |
            +---------------+

You can see an example of this data here:

    http://oss.oetiker.ch/smokeping-demo/

Look at the various graph groups and notice that many of the graphs have
multiple lines, with the color code chart listing items such as "median
RTT from mipsrv01". These are not MultiHost graphs, but rather graphs
with data from external smokeping servers.

To configure a smokeping master/slave server you can see the
documentation here:

    http://oss.oetiker.ch/smokeping/doc/smokeping_master_slave.en.html

In addition, a sample set of steps for configuring this is available in
the file sample-smokeping-master-slave.txt located in:

    http://noc/files/
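
Going back to exercises 3 and 6: the per-PC entries in Targets are
repetitive, so typing them by hand invites copy-and-paste mistakes. The
following is a minimal sketch, not part of the original exercises, that
prints the stanzas and a matching MultiHost "host" line for a list of PC
numbers. The function names gen_pc_targets and gen_multihost_line are
hypothetical, and the "++" depth and the /Local/ path assume the
"+ Local" section used in the exercise 6 example.

```shell
#!/bin/sh
# Hypothetical helper: print one Targets stanza per PC number given.
# The "++" depth assumes the entries sit under a "+ Local" section,
# as in the exercise 6 example.
gen_pc_targets() {
    for n in "$@"; do
        printf '++ pc%s\n' "$n"
        printf 'menu = pc%s.mgmt\n' "$n"
        printf 'title = pc%s\n' "$n"
        printf 'host = pc%s.mgmt.ws.afnog.org\n\n' "$n"
    done
}

# Hypothetical helper: print the "host =" line of a MultiHost graph
# that combines the same PCs (paths assume the "+ Local" section).
gen_multihost_line() {
    printf 'host ='
    for n in "$@"; do
        printf ' /Local/pc%s' "$n"
    done
    printf '\n'
}

# Example: the first three PCs of row 1.
gen_pc_targets 1 2 3
gen_multihost_line 1 2 3
```

The output goes to the screen only. Review it, then paste the parts you
want into /etc/smokeping/config.d/Targets and restart smokeping as in
the exercises above.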