
Linux : 2015. 11. 25. 11:20

Source: http://people.redhat.com/dledford/memtest.shtml



memtest.sh



Why a memory test script?

As it turns out, many of the dedicated memory test programs you can run on an Intel computer are not all that good. The problem is that they test the memory in your computer by beating on it with the CPU alone. Unfortunately, in real life, the CPU isn't the only thing that beats on your memory. DMA based IDE drives, DMA based SCSI transfers, almost all modern PCI controller cards, etc. all use direct memory access to transfer data in and out of the machine. This takes place entirely outside of the CPU, and in parallel with CPU operations. So, under a real system load, part of your memory bandwidth is consumed by the CPU and part of it is consumed by these DMA operations. No matter what program you run, the CPU alone typically isn't fast enough to place the same load on your memory that this combination of CPU and DMA does. Or, in other cases, the memory access queue built into the memory controller in the chipset may be at fault. This script doesn't attempt to root-cause the problem, it just tests for it. 

It turns out there is a second use for this script as well. Once you start talking about keeping the local hard drives busy 100% of the time and also keeping the CPU chugging along, you're already very close to performing a decent test of your machine's power supply capacity versus maximum power demands. All you really need to do is throw in some 3D operations if you have a 3D video card, maybe burn a DVD, and shoot some network packets out at the same time and you've pretty much maxed out your power requirements (this assumes that in a multi-disk machine we are beating up all the hard drives, not just one). If someone wanted to send me patches to do these things in the script, I'd gladly take them, but even without that, you could just start this script, and maybe glxgears, and netperf, and then see what happens. 

However, unlike the failure mode for just doing the memory test, a power supply failure can result in totally random problems, including hard drives simply spinning down, total hard locks of the computer, video glitches on screen, failure during a DVD burn, etc. If you run this test in power supply test mode, then if your computer does anything out of the ordinary that it shouldn't do, you could have a power supply problem. Of course, repeatability of the failure is important. I don't want to recommend people go replace a power supply because they ran this test and they heard a pop of static come out of their audio device during the test on a single run. Make sure that the machine actually fails reliably before using this test as a reason to go buy (or demand replacement of) your power supply. 
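The combined load described above can be sketched as a small wrapper. This is a hedged sketch, not part of the original script: glxgears, netperf, and the "fileserver" hostname are only examples; substitute whatever actually exercises your GPU and network.

```shell
#!/bin/bash
# Sketch: start optional load generators in the background, run the
# memory test in the foreground, then stop the extra load.  The tool
# names and the "fileserver" host below are examples only.
run_if_present() {
	# Launch a load generator in the background, but only if installed.
	command -v "$1" >/dev/null 2>&1 && "$@" >/dev/null 2>&1 &
}
run_if_present glxgears                # 3D load on the video card
run_if_present netperf -H fileserver   # network load
# ./memtest.sh -p                      # the memory test itself goes here
kill %1 %2 >/dev/null 2>&1 || true     # stop the background load, if any
wait
```

If the machine misbehaves only when all of these run together, suspect the power supply rather than the memory.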

What do you need?

You will need the shell script below (which only works with fairly recent versions of the bash shell) and you will need a linux kernel (or other suitable) tar.bz2 file. I recommend the tarball be compressed with bzip2, not gzip, and that you have the parallel bzip2 utility (pbzip2) installed on your machine. You will also want to run the test in a directory on local disks, unless you have both a network and a file server capable of reading/writing files faster than your hard drives (not as unlikely as you might think these days...a file server with a fast disk array that's accessible over 10GBit ethernet or InfiniBand will outrun a single local hard disk). In addition, if you have multiple local hard drives, it would be best if they were all used in the test. Whether that's by the script running on a software raid that spans all the disks or by running the test separately on each disk doesn't matter so much, but sorting out how to hit all the disks in your machine is more code than I wanted to put into the script, so you'll need to decide how to make it happen yourself and act accordingly. 
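If you don't have a kernel tarball handy, any sufficiently large directory tree can be packed up as the source file. A hedged example follows: the paths and the use of /usr/include are purely illustrative, and since the script detects bzip2 vs gzip content with `file` rather than by filename, the last-resort gzip fallback still works.

```shell
#!/bin/bash
# Example only: build a source tarball for the test in /tmp.  /usr/include
# is just a convenient large tree; use a real kernel tarball if you have
# one.  Prefer pbzip2, then bzip2, then gzip as a last resort.
TEST_DIR=/tmp
PACKER=$(command -v pbzip2 || command -v bzip2 || command -v gzip)
mkdir -p "$TEST_DIR/memtest-src"
cp -r /usr/include "$TEST_DIR/memtest-src/" 2>/dev/null
tar -cf - -C "$TEST_DIR" memtest-src | "$PACKER" -c > "$TEST_DIR/linux.tar.bz2"
ls -l "$TEST_DIR/linux.tar.bz2"
```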

Here's the shell script:

#!/bin/bash
#
# memtest
#
# A general purpose memory tester script, with options for enabling extra
# simultaneous tests in an attempt to test the capacity of the power supply
# in the machine.  You can find the original source of this script and
# additional documentation on its usage by visiting my work web page at
# http://people.redhat.com/~dledford/
#
# Author: Doug Ledford  + contributors
#
# (C) Copyright 2000-2002,2008,2012 Doug Ledford; Red Hat, Inc.
# This shell script is released under the terms of the GNU General
# Public License Version 3, June 2007.  If you do not have a copy
# of the GNU General Public License Version 3, then one may be
# retrieved from http://people.redhat.com/~dledford/gpl.txt
#

# Here we set the defaults for this script.  You can edit these to
# make permanent changes, or you can pass in command line arguments to
# make temporary changes
TEST_DIR=/tmp
SOURCE_FILE=linux.tar.bz2
NR_PASSES=20
HELP=
PARALLEL=no
EXTRACT=yes
MB=1048576

usage ()
{
	echo "Usage: `basename $0` [options]"
	echo "Options:"
	echo "	-u	Display this usage info"
	echo "	-h	Give full help description of each option"
	echo "	-t <dir>"
	if [ -n "$HELP" ]; then
	echo "		The directory where we will unpack the tarballs and run"
	echo "		the diff comparisons.  Ideally, this will hit all local"
	echo "		disks by being part of a software raid array (if there"
	echo "		is more than one local disk anyway).  It can also be on"
	echo "		a network file server if you have both a very fast disk"
	echo "		array on the server, and a very fast network inter-"
	echo "		connect between the server and your machine (10GBit/s"
	echo "		or faster is needed in order to outrun a single modern"
	echo "		SATA hard drive that's local)."
	fi
	echo "		Default: /tmp"
	echo "	-s <file>"
	if [ -n "$HELP" ]; then
	echo "		The file we will decompress and untar to create the"
	echo "		file trees we intend to compare against each other.  I"
	echo "		recommend that this file be a tar.bz2 file and that you"
	echo "		have the parallel bzip2 utility (pbzip2) installed on"
	echo "		your machine.  This at least allows the bzip2 unzip"
	echo "		operation to use all CPUs when not running the test in"
	echo "		parallel mode.  This file is expected to be in the test"
	echo "		directory.  You can use a relative path from the test"
	echo "		directory in case it is located somewhere else."
	fi
	echo "		Default: linux.tar.bz2"
	echo "	-x	Toggle whether or not we extract the contents of the"
	echo "		tarball, or just decompress it."
	if [ -n "$HELP" ]; then
	echo "		In the event that extracting lots of small files slows"
	echo "		your disk subsystem down too much and our overall write"
	echo "		speed falls too low, it won't make an effective test of"
	echo "		the disk DMA operations against CPU memory operations."
	echo "		If your write speeds aren't high enough, try disabling"
	echo "		extraction and see if that speeds things up."
	fi
	echo "		Default: extract the contents"
	echo "	-n <passes>"
	if [ -n "$HELP" ]; then
	echo "		How many times we will run the test before we consider"
	echo "		it complete.  It's possible to pass most of the time"
	echo "		and only fail once in a while, so we run the whole"
	echo "		thing multiple times by default."
	fi
	echo "		Default: 20"
	echo "	-m <megs>"
	if [ -n "$HELP" ]; then
	echo "		How many megabytes does the uncompressed tarball use on"
	echo "		the filesystem.  We assume a roughly 75% compression"
	echo "		ratio in the compressed tarball if the compression used"
	echo "		is gzip, and 80% if the compression is bzip2.  So we"
	echo "		take 4 or 5 times the size of the file depending on the"
	echo "		file extension."
	fi
	echo "		Default: 4 * sizeof source .gz files,"
	echo "			 5 * sizeof source .bz2 files"
	echo "	-c <copies>"
	if [ -n "$HELP" ]; then
	echo "		We normally just calculate how many copies of the"
	echo "		tarball to extract in order to be roughly 1.5 times"
	echo "		physical memory size, but you can pass in a number if"
	echo "		this doesn't work for some reason."
	fi
	echo "		Default: physical ram size * 1.5 / megs_per_copy"
	echo "	-p	Toggles whether or not we will extract and diff all the"
	echo "		source trees in parallel or serial."
	if [ -n "$HELP" ]; then
	echo "		Most linux filesystems will be much faster when running"
	echo "		tests serially.  It tends to bog the drives down when"
	echo "		the heads have to seek back and forth a lot to satisfy"
	echo "		multiple readers/writers acting simultaneously.  But,"
	echo "		parallel operations will tend to create much higher"
	echo "		memory pressure and can be useful testing the virtual"
	echo "		memory subsystem, so it's available as an option here."
	fi
	echo "		Default: serial"
	echo "	-i	Just parse the arguments and say what figures we came"
	echo "		up with and then exit."
	exit 1
}

clean_exit()
{
	# Kill any children that might be in the background, as well as any
	# currently running foreground apps, then cleanup, then exit
	for job in `jobs -p`; do
		kill -9 $job >/dev/null 2>&1
		while [ -n "`ps --no-heading $job`" ]; do sleep .2s; done
	done
	echo -n "Waiting for all pipelines to exit..."
	wait
	echo "done."
	echo -n "Cleaning up work directory..."
	rm -fr memtest-work
	echo "done."
	popd
	exit $1
}

trap_handler()
{
	echo " test aborted by interrupt."
	clean_exit 1
}

while [ -n "$1" ]; do
	case "$1" in
	-u)
		USAGE=1
		shift
		;;
	-h)
		HELP=1
		shift
		;;
	-t)
		TEST_DIR="$2"
		shift 2
		;;
	-s)
		SOURCE_FILE="$2"
		shift 2
		;;
	-x)
		[ $EXTRACT = yes ] && EXTRACT=no || EXTRACT=yes
		shift
		;;
	-n)
		NR_PASSES="$2"
		shift 2
		;;
	-m)
		MEGS_PER_COPY="$2"
		shift 2
		;;
	-c)
		NR_COPIES="$2"
		shift 2
		;;
	-p)
		[ $PARALLEL = yes ] && PARALLEL=no || PARALLEL=yes
		shift
		;;
	-i)
		JUST_INFO=1
		shift
		;;
	*)
		echo "Unknown option $1"
		USAGE=1
		shift
		;;
	esac
done

[ -n "$USAGE" ] && usage

if [ ! -f "$TEST_DIR/$SOURCE_FILE" ]; then
  echo "Missing source file $TEST_DIR/$SOURCE_FILE"
  usage
fi

BZIP2=`file -b "$TEST_DIR/$SOURCE_FILE" | grep bzip2`
if [ -n "$BZIP2" ]; then
	COMPRESS_RATIO=5
	COMPRESS_PROG=`which pbzip2 2>/dev/null`
	[ -z "$COMPRESS_PROG" ] && COMPRESS_PROG=`which bzip2 2>/dev/null`
else
	COMPRESS_RATIO=4
	COMPRESS_PROG=`which gzip 2>/dev/null`
fi

# Guess how many megs the unpacked archive is, unless -m told us.
ARCHIVE_SIZE_MB=$(ls -l --block-size=$MB "$TEST_DIR/$SOURCE_FILE" | awk '{ print $5 }')
if [ -z "$MEGS_PER_COPY" ]; then
  EXTRACTED_SIZE_MB=$(echo "$ARCHIVE_SIZE_MB * $COMPRESS_RATIO" | bc)
else
  EXTRACTED_SIZE_MB=$MEGS_PER_COPY
fi

# How many trees do we have to unpack in order to make our trees be larger
# than physical RAM?  We shoot for 1.5 times physical RAM size just to be
# sure we unpack plenty and to compensate in case our estimate of unpacked
# size is inaccurate. 
MEM_TOTAL_MB=$(free -m | awk '/^Mem:/ { print $2 }')
if [ -z "$NR_COPIES" ]; then
  NR_COPIES=$(echo "$MEM_TOTAL_MB.000 * 1.500 / $EXTRACTED_SIZE_MB" | bc)
fi
# Compute this even when -c was given, so the free space check below works.
MIN_FREE_DISK=$(( MEM_TOTAL_MB + MEM_TOTAL_MB ))

# Check for disk free space and bail if we don't have enough
DISK_FREE_MB=$(df -B$MB $TEST_DIR | awk '!/^Filesystem/{ print $4 }')
DISK_FS_TYPE=$(df -T $TEST_DIR | awk '!/^Filesystem/{ print $2 }')

if [ $MIN_FREE_DISK -gt $DISK_FREE_MB ]; then
	echo "Error: Not enough free disk space in test directory"
	echo "	Based on memory size of machine, you need at least"
	echo "	$MIN_FREE_DISK MB of free space and you only have"
	echo "	$DISK_FREE_MB MB at the moment.  Please free up"
	echo "	disk space or set TEST_DIR to a directory on a"
	echo "	filesystem that has enough free space to run the test."
	echo
	JUST_INFO=1
fi

echo "TEST_DIR:		$TEST_DIR"
echo "DISK_FREE_MB:		$DISK_FREE_MB"
echo "DISK_FS_TYPE:		$DISK_FS_TYPE"
echo "SOURCE_FILE:		$SOURCE_FILE"
echo "COMPRESS_PROG:		$COMPRESS_PROG"
echo "COMPRESS_RATIO:		$COMPRESS_RATIO"
echo "ARCHIVE_SIZE_MB:	$ARCHIVE_SIZE_MB"
echo "EXTRACTED_SIZE_MB:	$EXTRACTED_SIZE_MB"
echo "MEM_TOTAL_MB:		$MEM_TOTAL_MB"
echo "NR_COPIES:		$NR_COPIES"
echo "NR_PASSES:		$NR_PASSES"
echo "PARALLEL:		$PARALLEL"
echo "EXTRACT:		$EXTRACT"
echo
if [ -n "$JUST_INFO" ]; then
  exit 0
fi

# OK, options parsed and sanity tests passed, here starts the actual work
pushd $TEST_DIR

# Set our trap handler (SIGKILL can't be caught, so trap INT and TERM)
trap 'trap_handler' INT TERM

# Remove any possible left over directories from a cancelled previous run
rm -fr memtest-work

# Unpack the one copy of the source tree that we will be comparing against
echo -n "Creating comparison source..."
if [ $EXTRACT = yes ]; then
  mkdir -p memtest-work/original
  $COMPRESS_PROG -dc $SOURCE_FILE 2>/dev/null | tar -xf - -C memtest-work/original >/dev/null 2>&1 &
  wait
  if [ $? -gt 128 ]; then
    clean_exit 1
  fi
else
  mkdir -p memtest-work
  $COMPRESS_PROG -dc $SOURCE_FILE > memtest-work/original 2>/dev/null &
  wait
  if [ $? -gt 128 ]; then
    clean_exit 1
  fi
fi
echo "done."

i=1
while [ "$i" -le "$NR_PASSES" ]; do
  echo -n "Starting test pass #$i: "
  j=0
  echo -n "unpacking"
  while [ "$j" -lt "$NR_COPIES" ]; do
    if [ $PARALLEL = yes ]; then
      if [ $EXTRACT = yes ]; then
        mkdir -p memtest-work/$j
	$COMPRESS_PROG -dc $SOURCE_FILE 2>/dev/null | tar -xf - -C memtest-work/$j >/dev/null 2>&1 &
      else
        $COMPRESS_PROG -dc $SOURCE_FILE > memtest-work/$j 2>/dev/null &
      fi
    else
      if [ $EXTRACT = yes ]; then
	mkdir -p memtest-work/$j
	$COMPRESS_PROG -dc $SOURCE_FILE 2>/dev/null | tar -xf - -C memtest-work/$j >/dev/null 2>&1 &
	wait
	if [ $? -gt 128 ]; then
	  clean_exit 1
	fi
      else
	$COMPRESS_PROG -dc $SOURCE_FILE > memtest-work/$j 2>/dev/null &
	wait
	if [ $? -gt 128 ]; then
	  clean_exit 1
	fi
      fi
    fi
    j=$(( j + 1 ))
  done
  if [ $PARALLEL = yes ]; then
    wait
    if [ $? -gt 128 ]; then
      clean_exit 1
    fi
  fi
  j=0
  echo -n ", comparing"
  while [ "$j" -lt "$NR_COPIES" ]; do
    if [ $PARALLEL = yes ]; then
      if [ $EXTRACT = yes ]; then
        diff -uprN memtest-work/original memtest-work/$j &
      else
        cmp memtest-work/original memtest-work/$j &
      fi
    else
      if [ $EXTRACT = yes ]; then
        diff -uprN memtest-work/original memtest-work/$j &
	wait
	if [ $? -gt 128 ]; then
	  clean_exit 1
	fi
      else
        cmp memtest-work/original memtest-work/$j &
	wait
	if [ $? -gt 128 ]; then
	  clean_exit 1
	fi
      fi
    fi
    j=$(( j + 1 ))
  done
  if [ $PARALLEL = yes ]; then
    wait
    if [ $? -gt 128 ]; then
      clean_exit 1
    fi
  fi
  j=0
  echo -n ", removing"
  while [ "$j" -lt "$NR_COPIES" ]; do
    rm -fr memtest-work/$j &
    wait
    if [ $? -gt 128 ]; then
      clean_exit 1
    fi
    j=$(( j + 1 ))
  done
  echo ", done."
  i=$(( i + 1 ))
done

clean_exit 0



    



How do you know if your memory passed?

Very simple. If you run the script from the command line on your computer and it completes without ever spewing a single error onto your screen, then you passed. If you get messages from diff about differences between files, or any other anomalies like that, then you failed.
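In other words, any output from diff or cmp means a failure. One way to check a long run is to capture it in a log and scan it afterwards. This is only a sketch: the log line below is synthetic, standing in for real memtest.sh output.

```shell
#!/bin/bash
# Synthetic log line standing in for script output; a real run would be
# captured with:  ./memtest.sh 2>&1 | tee memtest.log
printf 'Starting test pass #1: unpacking, comparing, removing, done.\n' > memtest.log
# diff headers start with ---/+++/@@; cmp prints "... differ: byte N"
if grep -qE '^(---|\+\+\+|@@|Binary files)|differ' memtest.log; then
	verdict=FAILED
else
	verdict=passed
fi
echo "memory test $verdict"
rm -f memtest.log
```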

Here's a good run:

[dledford@firewall ~]$ memtest.sh -x -c 1 -n 2
TEST_DIR:		/tmp
SOURCE_FILE:		linux.tar.bz2
NR_PASSES:		2
MEGS_PER_COPY:		240
NR_COPIES:		1
PARALLEL:		no
COMPRESS_RATIO:		5
COMPRESS_FLAG:		j
COMPRESS_PROG:		/usr/bin/pbzip2
EXTRACT:		no

Creating comparison source...done.
Starting test pass #1: unpacking, comparing, removing, done.
Starting test pass #2: unpacking, comparing, removing, done.
[dledford@firewall ~]$ 
    


Posted by Real_G