How to use profile and top in Gluster?

In this article I will show you how to use the profile and top features of Gluster. These utilities help us troubleshoot performance issues with GlusterFS.

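Step 1 : Profiling generates a per-brick report of file operation (FOP) counts and latencies, which helps when troubleshooting performance issues. As soon as we enable profiling, two new options, diagnostics.count-fop-hits and diagnostics.latency-measurement, appear under Options Reconfigured in the volume info output.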

[root@Node2 tmp]# gluster vol profile RepVol1 start
Starting volume profile on RepVol1 has been successful

[root@Node2 tmp]# gluster vol info RepVol1

Volume Name: RepVol1
Type: Replicate
Volume ID: 4f5301f0-a436-4b19-b653-9613a260988b
Status: Started
Snap Volume: no
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: Node1:/Replicated/RepBrickNode1
Brick2: Node2:/Replicated/RepBrickNode2
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
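
If you want to confirm from a script that profiling is active, one option is to look for those two options in the volume info output. The following is only a sketch: the function name profiling_enabled is made up for this example, and it assumes the option lines look exactly like the ones shown above.

```shell
# Hypothetical helper (not part of Gluster): reads `gluster volume info`
# output on stdin and succeeds only if both profiling options are "on".
profiling_enabled() {
    awk '
        $1 == "diagnostics.count-fop-hits:"      && $2 == "on" { hits = 1 }
        $1 == "diagnostics.latency-measurement:" && $2 == "on" { lat = 1 }
        END { exit !(hits && lat) }
    '
}

# Usage (needs a running cluster):
#   gluster volume info RepVol1 | profiling_enabled && echo "profiling is on"
```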

Step 2 : Let us check the statistics now.

[root@Node2 tmp]# gluster vol profile RepVol1 info
Brick: Node2:/Replicated/RepBrickNode2
--------------------------------------
Cumulative Stats:
Block Size:                  8b+              131072b+
No. of Reads:                    2                     0
No. of Writes:                    0                  2000
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
0.00       0.00 us       0.00 us       0.00 us             18     RELEASE
0.00       0.00 us       0.00 us       0.00 us             14  RELEASEDIR
18.37     241.00 us     241.00 us     241.00 us              1      LOOKUP
18.83     247.00 us     247.00 us     247.00 us              1    GETXATTR
62.80     412.00 us     276.00 us     548.00 us              2     READDIR

Duration: 7178 seconds
Data Read: 18 bytes
Data Written: 262144000 bytes

Interval 0 Stats:
Block Size:                  8b+              131072b+
No. of Reads:                    2                     0
No. of Writes:                    0                  2000
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
0.00       0.00 us       0.00 us       0.00 us             18     RELEASE
0.00       0.00 us       0.00 us       0.00 us             14  RELEASEDIR
18.37     241.00 us     241.00 us     241.00 us              1      LOOKUP
18.83     247.00 us     247.00 us     247.00 us              1    GETXATTR
62.80     412.00 us     276.00 us     548.00 us              2     READDIR

Duration: 7178 seconds
Data Read: 18 bytes
Data Written: 262144000 bytes

Brick: Node1:/Replicated/RepBrickNode1
--------------------------------------
Cumulative Stats:
Block Size:                  8b+              131072b+
No. of Reads:                    0                     0
No. of Writes:                    1                  2000
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
0.00       0.00 us       0.00 us       0.00 us             18     RELEASE
0.00       0.00 us       0.00 us       0.00 us             13  RELEASEDIR
17.53     251.00 us     251.00 us     251.00 us              1      LOOKUP
24.51     351.00 us     351.00 us     351.00 us              1    GETXATTR
57.96     415.00 us     266.00 us     564.00 us              2     READDIR

Duration: 7179 seconds
Data Read: 0 bytes
Data Written: 262144009 bytes

Interval 0 Stats:
Block Size:                  8b+              131072b+
No. of Reads:                    0                     0
No. of Writes:                    1                  2000
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
0.00       0.00 us       0.00 us       0.00 us             18     RELEASE
0.00       0.00 us       0.00 us       0.00 us             13  RELEASEDIR
17.53     251.00 us     251.00 us     251.00 us              1      LOOKUP
24.51     351.00 us     351.00 us     351.00 us              1    GETXATTR
57.96     415.00 us     266.00 us     564.00 us              2     READDIR

Duration: 7179 seconds
Data Read: 0 bytes
Data Written: 262144009 bytes
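
On a busy volume this report gets long, so it can help to reduce it to one line per brick. The sketch below picks, for each brick, the FOP with the largest %-latency in the Cumulative Stats section. The function name slowest_fop_per_brick is hypothetical, and the parsing assumes the column layout shown above (nine whitespace-separated fields per FOP row); it is not an official tool.

```shell
# Hypothetical helper: reads `gluster volume profile <VOL> info` output on
# stdin and prints "<brick> <fop> <pct>%" for the FOP with the highest
# %-latency in each brick's "Cumulative Stats" section.
slowest_fop_per_brick() {
    awk '
        function report() {
            if (brick != "" && fop != "")
                printf "%s %s %s%%\n", brick, fop, pct
        }
        # A new brick starts: print the previous one and reset the state.
        $1 == "Brick:" { report(); brick = $2; fop = ""; max = -1; cumulative = 0 }
        /^ *Cumulative Stats:/      { cumulative = 1 }
        /^ *Interval [0-9]+ Stats:/ { cumulative = 0 }
        # FOP rows: pct, avg "us", min "us", max "us", calls, name.
        cumulative && NF == 9 && $3 == "us" && $1 + 0 > max {
            max = $1 + 0; pct = $1; fop = $9
        }
        END { report() }
    '
}

# Usage (needs a running cluster with profiling started):
#   gluster volume profile RepVol1 info | slowest_fop_per_brick
```

For the output shown above, both bricks would report READDIR, which is expected on an idle test volume.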

If you want the statistics for the NFS servers only, append nfs to the command.

[root@Node2 tmp]# gluster vol profile RepVol1 info nfs
NFS Server : localhost
----------------------
Cumulative Stats:

Duration: 1248 seconds
Data Read: 0 bytes
Data Written: 0 bytes

Interval 0 Stats:

Duration: 1248 seconds
Data Read: 0 bytes
Data Written: 0 bytes

NFS Server : Node1
------------------
Cumulative Stats:

Duration: 1251 seconds
Data Read: 0 bytes
Data Written: 0 bytes

Interval 0 Stats:

Duration: 1251 seconds
Data Read: 0 bytes
Data Written: 0 bytes

NFS Server : Node3
------------------
Cumulative Stats:

Duration: 1247 seconds
Data Read: 0 bytes
Data Written: 0 bytes

Interval 0 Stats:

Duration: 1247 seconds
Data Read: 0 bytes
Data Written: 0 bytes

Step 3 : Don’t forget to disable profiling on the volume once you are done, otherwise it can hurt your performance badly.

[root@Node2 tmp]# gluster vol profile RepVol1 stop
Stopping volume profile on RepVol1 has been successful
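
Since it is easy to forget the stop step, the start, info and stop commands can be wrapped around the workload you want to measure. This is a minimal sketch under a few assumptions: profile_workload is a name invented for this example, the workload is passed as an ordinary command, and nothing here handles an interrupted run (if the script is killed midway, profiling stays enabled).

```shell
# Hypothetical wrapper: profile volume $1 while running the remaining
# arguments as a command, print the report, then stop profiling.
profile_workload() {
    vol=$1
    shift
    gluster volume profile "$vol" start || return 1
    "$@"                                  # the workload to measure
    status=$?
    gluster volume profile "$vol" info    # print the collected statistics
    gluster volume profile "$vol" stop    # stop even if the workload failed
    return "$status"
}

# Usage (the mount point /mnt/repvol1 is an assumption):
#   profile_workload RepVol1 dd if=/dev/zero of=/mnt/repvol1/testfile bs=128k count=2000
```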

Step 4 : Using the top command of GlusterFS.

To list the files which are opened most frequently:

[root@Node2 tmp]# gluster vol top RepVol1 open
Brick: Node2:/Replicated/RepBrickNode2
Current open fds: 1, Max open fds: 3, Max openfd time: 2014-12-26 22:36:23.83448
Count           filename
=======================
9               /file1
Brick: Node1:/Replicated/RepBrickNode1
Current open fds: 1, Max open fds: 3, Max openfd time: 2014-12-26 22:36:35.67090
Count           filename
=======================
9               /file1

We can limit the output to one particular brick.

[root@Node2 tmp]# gluster vol top RepVol1 open brick Node2:/Replicated/RepBrickNode2/
Brick: Node2:/Replicated/RepBrickNode2
Current open fds: 1, Max open fds: 3, Max openfd time: 2014-12-26 22:36:23.834480
Count           filename
=======================
9               /file1

To list the files with the highest number of read calls:

[root@Node2 tmp]# gluster vol top RepVol1 read brick Node2:/Replicated/RepBrickNode2/
Brick: Node2:/Replicated/RepBrickNode2
Count           filename
=======================
2               /file1

To list the files with the highest number of write calls:

[root@Node2 tmp]# gluster vol top RepVol1 write brick Node2:/Replicated/RepBrickNode2/
Brick: Node2:/Replicated/RepBrickNode2
Count           filename
=======================
400             /file15
400             /file14
400             /file13
400             /file12
400             /file11
3               /.file1.swp

To list the directories that are opened most frequently, use opendir. In this case the brick directory itself is accessed the maximum number of times.

[root@Node2 tmp]# gluster vol top RepVol1 opendir brick Node2:/Replicated/RepBrickNode2/
Brick: Node2:/Replicated/RepBrickNode2

To check the read performance of a particular brick:

[root@Node2 tmp]# gluster vol top RepVol1 read-perf brick Node2:/Replicated/RepBrickNode2/
Brick: Node2:/Replicated/RepBrickNode2
MBps Filename                                        Time
==== ========                                        ====
0 /file1                                          2014-12-26 21:18:46.989808

To check the write performance of a particular brick:

[root@Node2 tmp]# gluster vol top RepVol1 write-perf brick Node2:/Replicated/RepBrickNode2/
Brick: Node2:/Replicated/RepBrickNode2
MBps Filename                                        Time
==== ========                                        ====
0 /.file1.swp                                     2014-12-26 22:36:27.878776
0 /file15                                         2014-12-26 22:32:58.171973
0 /file14                                         2014-12-26 22:32:47.906899
0 /file13                                         2014-12-26 22:32:37.833390
0 /file12                                         2014-12-26 22:32:27.17147
0 /file11                                         2014-12-26 22:32:16.884791
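
The Gluster administration guide also documents bs and count arguments for read-perf and write-perf, which make the command run a short test with the given block size and block count on the brick instead of only reporting observed throughput. The helper below is a sketch of how that invocation could be wrapped; brick_perf and its default values (bs 256, count 1) are assumptions for this example, and the command has not been run against the volume in this article.

```shell
# Hypothetical helper: run a read-perf or write-perf test on one brick.
# Arguments: volume, mode, brick, optional block size, optional block count.
brick_perf() {
    vol=$1 mode=$2 brick=$3 bs=${4:-256} count=${5:-1}
    case $mode in
        read-perf|write-perf) ;;
        *) echo "mode must be read-perf or write-perf" >&2; return 2 ;;
    esac
    gluster volume top "$vol" "$mode" bs "$bs" count "$count" brick "$brick"
}

# Usage (needs a running cluster):
#   brick_perf RepVol1 write-perf Node2:/Replicated/RepBrickNode2 256 1
```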

Note that I issued these commands in my test environment, so the numbers shown here are not significant, but in production these commands can be really helpful for troubleshooting performance issues.

Key point : You can limit the number of files displayed in the output of the top commands using list-cnt <value>, where value is an integer specifying the number of files you want to list.
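
As an illustration of list-cnt, the small wrapper below asks for the N busiest files on one brick. The name top_n and the default of 5 entries are made up for this sketch; the underlying command is the same top command used above with list-cnt appended.

```shell
# Hypothetical helper: show the N busiest files on one brick for a given
# top metric (open, read, write, opendir or readdir).
top_n() {
    vol=$1 metric=$2 brick=$3 count=${4:-5}
    gluster volume top "$vol" "$metric" brick "$brick" list-cnt "$count"
}

# Usage (needs a running cluster): the three most written files on Node2's brick
#   top_n RepVol1 write Node2:/Replicated/RepBrickNode2 3
```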

2 thoughts on “How to use profile and top in Gluster?”

    1. Vikrant Post author

      It depends on whether the nodes in your trusted storage pool take LUNs from a SAN or use local disks. If they use a SAN, you may want to reach out to your storage team as well. Gluster works entirely at the software level.
