Linux Memory Analysis with the free and pmap Commands

Channel: Linux
Abstract: The pmap command reports the memory map of a process or processes; its -d option shows the device format.

Memory is one of the most misunderstood parameters of the whole system, especially when we talk of Unix-like systems.

This article mainly talks of simple tools that can be used to assess memory utilization by a process or by the system as a whole.

The two main areas are kernel memory and user memory; let's magnify these terms a bit.

Kernel Memory

It's the kernel-managed memory, which comprises:

• Text — where only the read-only parts of the program are stored. This is usually the actual instruction code of the program. Several instances of the same program can share this area of memory.

• Static Data — the area where memory whose size is known at compile time is allocated. This is generally for global variables and static C++ class members. The operating system allocates a copy of this memory area for each instance of the program.

• Memory Arena (also known as break space) — the area where dynamic runtime memory is stored. The memory arena consists of the heap and unused memory. The heap is where all user-allocated memory is located. The heap grows up from a lower memory address to a higher memory address.

• Stack — whenever a program makes a function call, the current function's state needs to be saved onto the stack. The stack grows down from a higher memory address to a lower memory address. A unique memory arena and stack exists for each instance of the program.
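
You can inspect these regions on any live process through /proc/<pid>/maps; for example, against your current shell (a quick illustrative check, the addresses will differ on every system):

grep -E "heap|stack" /proc/$$/maps   # the [heap] and [stack] regions
head -1 /proc/$$/maps                # typically the read-only, executable text mapping of the binary

The pmap tool introduced below presents this same kernel data in a friendlier format.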

User memory

It resides in the memory heap and is managed by memory routines such as malloc(), realloc(), calloc(), and free().

Memory plays an important role in system performance and is one of the most critical components to analyze during a performance issue. We have lots of tools with which performance can be measured, but sometimes they fail miserably, especially when you are checking not a home-grown lab box but a critical production server with 60 GB of RAM, multi-core CPUs, and a heavy application spawning millions of connections and creating a performance bottleneck. In such scenarios, running a top command will definitely not reflect the actual memory usage, so we usually don't rely much on ps or top: they report the memory usage of a process as if it were the only process running in the OS. Linux, however, also has the concept of shared libraries, so when we run ps or top against a process, the figures ignore details like the shared/private sections that show the actual memory usage. The one-liner further below will help us find the exact numbers for running processes.

You shouldn't always be content with these ps/top one-liners:

[redhat@localhost Desktop]$ ps -e -o user,pid,%cpu,%mem,rss,cmd --sort=-rss | head
USER PID %CPU %MEM RSS CMD
redhat 3114 4.2 1.1 21612 /usr/lib/vmware-tools/sbin32/vmtoolsd -n vmusr
root 2673 5.9 1.0 20032 /usr/bin/Xorg :0 -nr -verbose -audit 4 -auth /var/run/gdm/auth-for-gdm-WW9iey/database -nolisten tcp vt1
redhat 3090 5.3 0.9 18928 nautilus
redhat 3161 1.3 0.8 17444 /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=19
redhat 3207 3.4 0.7 14384 /usr/bin/gnome-terminal -x /bin/sh -c cd '/home/redhat/Desktop' && exec $SHELL
redhat 3162 1.2 0.6 13084 /usr/libexec/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf-ior-fd=28
redhat 3084 0.8 0.5 11620 gnome-panel
redhat 3146 0.7 0.5 11176 nm-applet --sm-disable
redhat 3133 0.4 0.5 10428 gnome-volume-control-applet
NB: The moral of this story is that process memory usage on Linux is a complex matter; you can't just run ps and know what is going on. This is especially true when you deal with programs that create a lot of identical child processes, like Java. ps might report that each Java process uses 100 megabytes of memory, when the reality might be that the marginal cost of each Java process is 10 megabytes of memory.
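
One way to see this double counting in action is to sum the RSS column across all processes and compare the total with what free reports; the sum is usually larger than real usage because shared pages are counted once per process (a rough illustrative check):

ps -e -o rss= | awk '{sum += $1} END {printf "Sum of all RSS: %d MB\n", sum/1024}'
free -m    # compare: the RSS sum typically exceeds the real used figure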

The one-liner below will give you a detailed snapshot of what's going on under the kernel's bed sheets:

for i in `ps -eaf | grep java | grep -v grep | awk '{print $2}'`; do  echo -n "PID $i actual memory  usage is :" >> totaluse.txt; pmap -d $i | grep -i "writeable/private: " >> totaluse.txt; done

Now don't be afraid of that one-liner; I will break it into pieces later to explain each and every bit of it, and then you will see how easy it is to use such one-liners in some of the most critical production scenarios. Here I used the pmap tool, so let me introduce pmap and define it officially as per its man page:

NAME
pmap - report memory map of a process

SYNOPSIS
pmap [ -x | -d ] [ -q ] pids...
pmap -V

DESCRIPTION
The pmap command reports the memory map of a process or processes.

GENERAL OPTIONS
-x extended Show the extended format.
-d device Show the device format.
-q quiet Do not display some header/footer lines.
-V show version Displays version of program.

EXTENDED AND DEVICE FORMAT FIELDS
Address: start address of map
Kbytes: size of map in kilobytes
RSS: resident set size in kilobytes
Dirty: dirty pages (both shared and private) in kilobytes
Mode: permissions on map: read, write, execute, shared, private (copy on write)
Mapping: file backing the map, or '[ anon ]' for allocated memory, or '[ stack ]' for the program stack
Offset: offset into the file
Device: device name (major:minor)
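
For orientation, the footer of a device-format run is exactly what the one-liner above greps for; against the current shell, for instance (the figures shown are illustrative):

pmap -d $$ | tail -1
# e.g.: mapped: 15128K    writeable/private: 10824K    shared: 28K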

So we can view the process map of any process and its actual memory consumption, not the vague average reported by top. I will take a production scenario to explain my one-liner above and other memory calculations based on memory tools like free.

Scenario: it is difficult to find the actual memory consumption; top reports memory as 98% utilized, free aligns with the top output, and yet no processes other than system processes are running in the system. What could be the reason, and how do we debug it?
Linux Philosophy

The philosophy in Linux is that an unused resource is a wasted resource. The kernel will therefore use as much RAM as it can to cache information from your local and remote filesystems and disks. This builds up over time as reads and writes are done on the system, the kernel trying to keep the data stored in RAM as relevant as possible to the processes that have been running on your system. This caching is reported by the system as the sum of two numbers, buffers and pagecache. The cache is reclaimed not at the time of process exit (you might start up another process soon that needs the same data), but on demand: when you start a process that needs a lot of memory to run, the Linux kernel will reclaim memory that had been storing cached data and give it to the new process.
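
You can watch this behavior directly: reading any large file grows the cached figure until the kernel needs that memory back (/path/to/some_large_file below is a placeholder; any big read will do):

free -m                                    # note the current "cached" figure
cat /path/to/some_large_file > /dev/null   # pull the file through the page cache
free -m                                    # "cached" has grown, "free" has shrunk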

That said, there are some things reported as 'cached' that the kernel cannot simply reclaim:

1. anonymous mmaps, which are not backed by a file but by the swap area.

2. shared memory regions, both System V IPC and POSIX /dev/shm ones.

3. Some application servers and databases (e.g. SAP and Oracle DB) make use of those shared memory facilities as a very convenient way of sharing data between multiple processes.

Both anonymous mmaps and shared memory regions can be swapped out from memory to disk, so theoretically your applications could use that memory; however, your system is likely to experience performance issues should that occur.

Consequently, if your system is using any of the above facilities you need to keep in mind that not all the memory reported as 'cached' should be accounted as available for your applications.

These will, however, be reported against all processes attached to them, unlike the normal cache, which is not part of the address space of any running process but is simply a kernel mapping.
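
To check whether such regions exist on your box, you can list the System V shared memory segments and the POSIX /dev/shm usage:

ipcs -m         # System V shared memory segments and their sizes
df -h /dev/shm  # space consumed by POSIX shared memory objects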

Consider the following free output, for example (units are in megabytes):

# free -m
             total       used       free     shared    buffers     cached
Mem:          1000        900        100          0        350        350
-/+ buffers/cache:         200        800

In this example, as far as applications are concerned the system is using only 200MB of memory and has 800MB free and available for use if needed (as long as no anonymous memory maps or shared memory regions are in place).

Note: in this example,

Total Physical Memory = 1000 M

Physically Used Memory = 900 M

Actual used memory = 200 M

buffers = 350 M

cached = 350 M

Physically Free Memory = 100 M

Memory free for Applications = 800 M

The items to note here are:

<Physically Used Memory> = <Actual used memory> + <buffers> + <cache> = 200 + 350 + 350 = 900 M

<Physically Free Memory> = <Total Physical Memory> - <Actual used memory> - <buffers> - <cache> = 1000 - 200 - 350 - 350 = 100 M

<Memory free for Applications> = <Total Physical Memory> - <Actual used memory> = 1000 - 200 = 800 M

<Memory used  by Applications> = <Physically Used Memory> - <buffers> - <cache> = 900 - 350 - 350 = 200 M
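
On procps versions that print the -/+ buffers/cache line, as in the output above, the last two figures can be pulled straight out of free (a minimal sketch; newer versions of free expose the same idea as an 'available' column instead):

free -m | awk '/buffers\/cache/ {print "Used by applications: " $3 " MB, free for applications: " $4 " MB"}'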

The above scenario is a perfect example of free tool usage: understanding each and every parameter of the output lets you draw solid conclusions and can reduce your total server upgrade expenditure.

We can explain a bit of the one-liner here now. In case the detailed process map is required for the above calculations, a little bash shell scripting is all it takes, and you can get "off the charts" values for your system performance. The process map for each system process can be obtained with the one-liner below, and the resulting sections can be added up to get the system or application memory usage if you want to avoid the free calculations above.

Pmap usage
for i in `ps -eaf | grep "<any application process>" | grep -v grep | awk '{print $2}'`; do echo -n "PID $i actual memory usage is :" >> totaluse.txt; pmap -d $i | grep -i "writeable/private: " >> totaluse.txt; done
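
Broken into pieces, the same one-liner reads like this ("myapp" is a placeholder for whatever process name you grep for):

# For every PID belonging to the target application...
for i in $(ps -eaf | grep "myapp" | grep -v grep | awk '{print $2}')
do
    # Label the line we are about to append
    echo -n "PID $i actual memory usage is :" >> totaluse.txt
    # Device-format pmap; keep only the footer line, whose
    # writeable/private field is the memory truly private to this process
    pmap -d $i | grep -i "writeable/private: " >> totaluse.txt
done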

Now the same "Memory used by Applications" parameter can be doubly confirmed with the above one-liner, if you have time and want an accurate breakdown of memory usage by application or system process. Doing a cat on totaluse.txt, you will see something like this:

(the output below is just an example and has nothing to do with the above calculation)

PID 16156 actual memory usage is :mapped: 15128K writeable/private: 10824K shared: 28K
PID 16158 actual memory usage is :mapped: 1381472K writeable/private: 1349680K shared: 2676K
PID 20220 actual memory usage is :mapped: 15196K writeable/private: 10892K shared: 28K
PID 20222 actual memory usage is :mapped: 2733700K writeable/private: 2708256K shared: 2932K
PID 27764 actual memory usage is :mapped: 15196K writeable/private: 10892K shared: 28K
PID 27766 actual memory usage is :mapped: 277176K writeable/private: 255552K shared: 3360K

Now if you add up the writeable/private sections, you will get the exact memory usage of these Java processes. Adding those columns, 10824 + 1349680 + 10892 + 2708256 + 10892 + 255552 KB comes to roughly 4.1 GB. I hope this helps.
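
Rather than adding the columns by hand, you can let awk total the writeable/private field from totaluse.txt (a small convenience sketch):

grep -o "writeable/private: [0-9]*K" totaluse.txt | awk '{sum += $2} END {printf "Total writeable/private: %.1f GB\n", sum/1024/1024}'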

To conclude, we have discussed free and pmap as ways to get exact memory utilization and to avoid the vague top output in critical production scenarios.

Ref From: linoxide