Posted by: reformedmusings | January 10, 2009

Monitor your hard drives in Ubuntu 8.10 Intrepid

Your hard drives store your precious data, yet few folks monitor their drives’ condition and health with regularity. Most important would be temperature, but internal errors and drive Self-Monitoring, Analysis and Reporting Technology System (S.M.A.R.T.) onboard logs bear periodic review as well. It is possible to do all this in Linux.

Although this post is written from an Ubuntu 8.10 Intrepid point of view, none of these programs are specific to Ubuntu. They should work in most versions of Linux, though how you install them may vary. Whether they support your hardware is a different matter. If your hard drives support S.M.A.R.T., what I lay out below should work.

lm-sensors

A good place to start is with lm-sensors. I have a post dedicated to its installation here. While that post was written for Hardy, it works the same in Intrepid. For a graphical interface, install ksensors in KDE or sensors-applet in Gnome. Both allow you to put sensor information on your panels. Since I covered Kubuntu last time, I’ll do Ubuntu here. First, load lm-sensors and the required support packages:

sudo aptitude install lm-sensors i2c-tools read-edid sensord hddtemp sensors-applet

These should automatically offer to install the support files libsensors3, libsensors4, and libsensors-applet-plugin0 as well. Note that you can also install these using a GUI package manager like Add/Remove or Synaptic Package Manager. After these install, run the detection script in the terminal:

sudo sensors-detect

Answer ‘y’ to every question, including ‘yes’ to adding a section to /etc/modules. When the script completes, restart the system.

You can check to ensure everything is working by typing “sensors” at the terminal (without the quotes) to obtain:

f71882fg-isa-0a00
Adapter: ISA adapter
3.3V:        +3.36 V
Vcore:       +1.16 V  (max =  +2.04 V)
Vdimm:       +1.68 V
Vchip:       +1.23 V
+5V:         +5.00 V
12V:        +14.37 V
5VSB:        +4.79 V
3VSB:        +3.36 V
Battery:     +3.09 V
CPU:        2636 RPM
System:        0 RPM  ALARM
Power:         0 RPM  ALARM
Aux:           0 RPM  ALARM
CPU:         +29.0°C  (high = +75.0°C, hyst = +71.0°C)
                      (crit = +75.0°C, hyst = +71.0°C)  sensor = transistor
System:      +35.0°C  (high = +85.0°C, hyst = +81.0°C)
                      (crit = +100.0°C, hyst = +96.0°C)  sensor = transistor

coretemp-isa-0000
Adapter: ISA adapter
Core 0:      +48.0°C  (high = +78.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1:      +45.0°C  (high = +78.0°C, crit = +100.0°C)

You can do this anytime to check your entire system at a glance.
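If you only care about the temperatures, you can filter the rest out. Here is a minimal sketch, assuming the output looks like the listing above. The function name sensors_temps is my own invention for illustration, not part of lm-sensors, and other chips may format their lines differently.

```shell
# Hypothetical helper: read `sensors` output on stdin and print
# "label<TAB>temperature" for each temperature reading.
# Voltage, fan, and continuation lines are skipped.
sensors_temps() {
    awk -F: '{
        # Split the text after the colon on whitespace; f[1] is the reading.
        split($2, f, " ")
        # Keep only readings that look like a number followed by a degree-C unit.
        if (f[1] ~ /^[+-]?[0-9.]+[^0-9.]*C$/) {
            sub(/[^0-9.]*C$/, "", f[1])   # drop the unit
            sub(/^[+]/, "", f[1])         # drop a leading plus sign
            printf "%s\t%s\n", $1, f[1]
        }
    }'
}

# Usage (needs lm-sensors installed and configured):
#   sensors | sensors_temps
```

On the listing above, this should leave just the CPU, System, Core 0, and Core 1 temperature lines.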

Right click on a blank section of the panel with the system tray on it (or on whatever panel you want the applet), and select “Add to Panel…”, then add the Hardware Sensors Monitor and click on the Add button as below:

hdd-sensor-applet1

When it appears on your panel, right-click it and select Preferences to get to the setup screens, then select the Sensors tab:

hdd-panel-setup1

The sensors come divided into three categories: nvidia, libsensors, and hddtemp. I have the GPUCoreTemp selected for display, as well as the consistently warmer of my two SATA drives in the RAID 1 array (discovered by monitoring them over time). Under libsensors, I also display the CPU core and general system temperature. You can set the high/low limits and alarms for each sensor by selecting it and then clicking on the Properties button. You can also change the icon displayed for each item. The net result for me looks like this:

hdd-panel

From left to right, the GPU, CPU (with a honkin’ Zalman 9700 fan), system, and SATA drive temperatures in Celsius. This enables me to continuously monitor the critical temperatures in my system.

hddtemp

You can also display the temperatures of all your hard drives using hddtemp in the terminal. In Intrepid, all my drives are designated as /dev/sdx, where x=a|b|c|etc. In other distributions/versions, some or all may show up as hdx (same scheme). You can check a single drive from the terminal by using its designation, such as:

sudo hddtemp /dev/sda

which will yield:

/dev/sda: WDC WD3200JB-00KFA0: 23°C

The easy way to see all the temperatures together is to use the wildcard character ‘?’:

sudo hddtemp /dev/sd?

Which provides an output like:

/dev/sda: WDC WD3200JB-00KFA0: 23°C
/dev/sdb: IC35L060AVER07-0: 33°C
/dev/sdc: IOMEGA  ZIP 250       ATAPI             : S.M.A.R.T. not available
/dev/sdd: ST3320620AS: 35°C
/dev/sde: ST3320620AS: 34°C
/dev/sdf: KingstonDataTraveler 2.0PMAP: S.M.A.R.T. not available
/dev/sdg: EPSON Stylus Storage: S.M.A.R.T. not available
/dev/sdh: Maxtor OneTouch: S.M.A.R.T. not available

Note that the devices that don’t have S.M.A.R.T. support – the external USB drive, ZIP drive, USB stick, and card reader in the printer – simply report as unsupported. The hard drives report back their current temperatures. As an aside, this is also a quick way to find out how your physical drives are mapped. Also note that you can only do this on the underlying physical drives in a RAID array. Don’t try to check a mapped RAID array directly, as you’ll get back an error:

ERROR: /dev/mapper/nvidia_figedjie: can’t determine bus type (or this bus type is unknown)

Typing “hddtemp --help” in the terminal will display a list of other things that you can do with it.
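If you would rather not eyeball the list each time, a small filter can flag only the warm drives. This is a rough sketch, assuming the output format shown above. The name hot_drives and the idea of a single limit are mine, not anything built into hddtemp.

```shell
# Hypothetical helper: read hddtemp output on stdin and print
# "device<TAB>temperature" for drives at or above the given limit (Celsius).
# Lines without a temperature (S.M.A.R.T. not available) are ignored.
hot_drives() {
    awk -F': ' -v limit="$1" '
        # The last field is either a reading such as 35 degrees C or a message.
        $NF ~ /^[0-9]+[^0-9]*C$/ {
            temp = $NF + 0              # numeric prefix of the reading
            if (temp >= limit + 0)
                printf "%s\t%d\n", $1, temp
        }'
}

# Usage (pick a limit that suits your drives):
#   sudo hddtemp /dev/sd? | hot_drives 40
```

What counts as too hot depends on the drive, so check the manufacturer's rated range before settling on a number.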

smartmontools

But that’s not all that you can learn about the hard drives. To go further and monitor the S.M.A.R.T. information, we have to install a few more applications from the repository (and again, you can use Add/Remove if desired):

sudo aptitude install smartmontools smart-notifier

The smartmontools package holds two applications, smartctl and smartd. They control and monitor hard disks that include the S.M.A.R.T. capability. smart-notifier only works under GTK.

You can get the detailed drive information as follows:

sudo smartctl -i /dev/sdd

Which in this case yields:

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10 family
Device Model:     ST3320620AS
Serial Number:    6QF0ZCSB
Firmware Version: 3.AAK
User Capacity:    320,072,933,376 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Jan  9 23:33:29 2009 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

In order to see all of the S.M.A.R.T. information in a drive, type this in the terminal:

sudo smartctl --all /dev/sda

I won’t duplicate the output here because it is quite extensive. There are some interesting lines, including the temperatures (with lifetime high and low), power-on time, and power cycles. In my case, the SATA drives have been running for a total of 11,625 hours and have only been power-cycled 39 times in over a year. I rarely power down, and a few of those cycles were due to local power outages. Also, the high/low lifetime temperatures show amazing consistency at 33/35 and 34/36 on the two SATAs. My somewhat older WD Caviar drive has 17,201 hours of runtime and 89 power cycles, but didn’t provide a temperature spread.
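To pull a single number out of all that, you can ask smartctl for just the attribute table (the --attributes option) and pick a row by name. The sketch below assumes the usual ten-column table with the attribute name in column 2 and the raw value starting in column 10. The function name smart_attr is hypothetical, and attribute names such as Power_On_Hours vary from drive to drive, so check your own output first.

```shell
# Hypothetical helper: read `smartctl --attributes` output on stdin and
# print the first word of the raw value for the named attribute.
smart_attr() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage (attribute names are examples; yours may differ):
#   sudo smartctl --attributes /dev/sdd | smart_attr Power_On_Hours
#   sudo smartctl --attributes /dev/sdd | smart_attr Power_Cycle_Count
```

Some drives pack extra detail after the first number in the raw value, which this sketch deliberately ignores.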

Additionally, S.M.A.R.T. reports the number of unrecoverable errors. So hopefully you see these lines:

SMART Error Log Version: 1
No Errors Logged

Gotta like that! You can also get an overall drive health assessment with:

sudo smartctl --health /dev/sdd

Which hopefully gives an answer like this:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

More good news!
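With several drives, checking them one at a time gets old. Here is a minimal sketch that looks for the PASSED line shown above. The function name health_passed is made up for this post, and the check assumes ATA drives that print exactly that line.

```shell
# Hypothetical helper: read `smartctl --health` output on stdin and
# succeed only if the overall-health line reports PASSED.
health_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Usage: loop over the drives and complain about any that do not pass.
#   for d in /dev/sd?; do
#       sudo smartctl --health "$d" | health_passed ||
#           echo "WARNING: $d did not report PASSED"
#   done
```

Devices without S.M.A.R.T. support, like the USB stick and ZIP drive earlier, will also trigger the warning, so treat it as a prompt to look closer rather than a verdict.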

You can use smartctl to do a number of things, including initiating self-tests of the drives and turning S.M.A.R.T. on if it isn’t already. “sudo smartctl --help” lays out all the options.
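As an illustration of those two tasks, the sketch below turns S.M.A.R.T. on and then starts a short self-test. The function name short_selftest and the SMARTCTL override are my own additions for previewing the steps, not part of smartmontools. The --smart, --test, and --log options are the ones listed by the help output.

```shell
# Hypothetical helper: enable S.M.A.R.T. on a drive, then start a short self-test.
# SMARTCTL is an illustrative override so the steps can be previewed without
# touching a drive, e.g.  SMARTCTL=echo short_selftest /dev/sdd
short_selftest() {
    dev=$1
    # --smart=on enables S.M.A.R.T.; --test=short starts the short self-test.
    ${SMARTCTL:-sudo smartctl} --smart=on "$dev" &&
        ${SMARTCTL:-sudo smartctl} --test=short "$dev"
}

# The test runs inside the drive itself. A few minutes later, read the results:
#   sudo smartctl --log=selftest /dev/sdd
```

The short test is a quick check. smartctl also offers a long test that takes much longer, so read the help output before trying that one.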

smart-notifier

Moving on, smart-notifier provides a system service that monitors the S.M.A.R.T. messages and will attempt to report drive errors to the user. After installation, go to System -> Preferences -> Sessions:

hdd-notifier

Scroll down to Smart Notifier and check it, then Close. To activate it without a restart, go to System -> Preferences -> Services, click on Unlock, put in your sudo password, scroll down to “Hardware monitor (smartnotifier)” and check it:

hdd-notify-service

Close and you’re good to go.

GSmartControl

There is a new, experimental GTK GUI front-end for smartctl called GSmartControl. You won’t find it in the Ubuntu repositories, so you will have to download it using a browser and install it yourself. There are downloads for various distributions here. Clicking on the proper download will bring up the download dialog in your browser. In Firefox under Ubuntu, pick “Open with…” and make sure that “GDebi Package Installer” is selected. Click on OK, and Firefox will download the file, then open GDebi. Click Install and you’re off to the races.

After the installation finishes, go to Applications -> System Tools -> GSmartControl:

hdd-ctl-gui

You can see and do some basic things on this screen, and more detailed stuff from the menus. If you right-click on a drive and select “View Details”, you can get to the detailed information for the drive:

hdd-gui-details

The GUI provides all the information that you can get from smartctl. Going through the tabs will display everything you received from “smartctl --all [device]” for that drive. It is still in beta status, so it may have bugs. Even so, I didn’t find any problems, and the interface is nicely done.

Conclusion

So, now you have a number of approaches and tools with which to keep track of the health of your hard drives. With this information, you can tailor the airflow in your computer to optimize cooling for your drives. You can also keep track of errors, perhaps saving your data before something bad and irreversible happens. Still, good monitoring is no substitute for regular backups. When it comes to your precious data, be afraid, be very afraid.

Remember – just because you’re paranoid doesn’t mean that your hard drive won’t crash tomorrow…
