Performance Co-Pilot Frequently Asked Questions
Performance Co-Pilot (PCP) is a framework and set of services to support system-level performance monitoring and performance management. The architecture and services are most attractive for those seeking centralized monitoring of distributed processing (e.g. in a cluster or webserver farm environment), or on large systems with lots of moving parts. However, some of the features of PCP are also useful for hard performance problems on smaller system configurations. More details are available on the main project page.

What is the overall PCP Architecture?
As the architecture diagram shows, performance data is exported from a host by the PMCD (Performance Metrics Co-ordinating Daemon). PMCD sits between monitoring clients and PMDAs (Performance Metric Domain Agents). The PMDAs know how to collect performance data. PMCD knows how to multiplex messages between the monitoring clients and the PMDAs.
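The multiplexing role of PMCD can be illustrated with a toy model. This is only a sketch, not PCP code: it assumes requests are routed by the leading domain field of a metric identifier (such as the 60 in 60.5.3), and the agent functions, the identifiers other than 60.5.3, and all values are invented for illustration.

```python
from collections import defaultdict

# Toy stand-ins for PMDAs. Each agent owns one domain and answers only for
# the identifiers it knows about. All names and values here are hypothetical.
def linux_agent(pmids):
    known = {"60.5.3": 3498272, "60.0.20": 1234}
    return {pmid: known.get(pmid) for pmid in pmids}

def demo_agent(pmids):
    known = {"250.0.1": 42}
    return {pmid: known.get(pmid) for pmid in pmids}

# Hypothetical routing table: domain number -> agent.
AGENTS = {60: linux_agent, 250: demo_agent}

def pmcd_fetch(pmids):
    """Split one client request by domain, ask each agent once, collate one reply."""
    by_domain = defaultdict(list)
    for pmid in pmids:
        by_domain[int(pmid.split(".")[0])].append(pmid)

    reply = {}
    for domain, wanted in by_domain.items():
        agent = AGENTS.get(domain)
        if agent is None:
            # No agent installed for this domain: "no data currently available".
            reply.update({pmid: None for pmid in wanted})
        else:
            reply.update(agent(wanted))
    return reply

# One request from the client, one collated reply back.
reply = pmcd_fetch(["60.5.3", "250.0.1", "60.0.20", "99.0.0"])
print(reply)
```

The client sees a single reply regardless of how many agents contributed, and a missing agent shows up as a missing value rather than as a failed request.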
How is PCP different from tools like vmstat, ps, top, etc?
Each of these standard Linux tools:
Each of these standard Linux tools could also be re-implemented over the PCP protocols, in which case they would each:
As a proof of concept, pmstat in the PCP Open Source release is a re-implementation of vmstat using the PCP APIs. In addition, new PCP clients could be written to extend the functionality of the Linux tools, e.g.
Metrics, names, instances and values, ... eh?
Performance Co-Pilot uses a single, comprehensive data model to describe all available performance data.
Putting this all together, we can use pminfo to explore the available information.
$ pminfo filesys
filesys.capacity
filesys.used
filesys.free
filesys.maxfiles
filesys.usedfiles
filesys.freefiles
filesys.mountdir
filesys.full
$ pminfo -md filesys.free
filesys.free PMID: 60.5.3
Data Type: 64-bit unsigned int InDom: 60.5 0xf000005
Semantics: instant Units: Kbyte
$ pminfo -f filesys.free
filesys.free
inst [0 or "/dev/root"] value 3498272
inst [1 or "/dev/hda3"] value 20106
inst [2 or "/dev/hda5"] value 7747420
inst [3 or "/dev/hda2"] value 368432
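To make the name / instance / value model concrete, the following sketch parses the `pminfo -f` output shown above into a dictionary keyed by instance name. It is an illustration only: the regular expression is inferred from the sample output in this FAQ, not from any documented format guarantee, and `parse_instances` is a hypothetical helper name.

```python
import re

# Sample text copied from the pminfo -f output above.
PMINFO_OUTPUT = '''\
filesys.free
inst [0 or "/dev/root"] value 3498272
inst [1 or "/dev/hda3"] value 20106
inst [2 or "/dev/hda5"] value 7747420
inst [3 or "/dev/hda2"] value 368432
'''

# One instance line: internal instance id, external instance name, value.
INST_LINE = re.compile(r'inst \[(\d+) or "([^"]*)"\] value (\d+)')

def parse_instances(text):
    """Return (metric_name, {instance_name: (instance_id, value)})."""
    lines = text.strip().splitlines()
    metric = lines[0].strip()
    values = {}
    for line in lines[1:]:
        match = INST_LINE.search(line)
        if match:
            inst_id, inst_name, value = match.groups()
            values[inst_name] = (int(inst_id), int(value))
    return metric, values

metric, free_kb = parse_instances(PMINFO_OUTPUT)
for name, (inst_id, kbytes) in free_kb.items():
    print(f"{metric}[{name}] (instance {inst_id}) = {kbytes} Kbyte")
```

One metric name (filesys.free) maps to a set of instances (one per filesystem), each with its own value in the units reported by `pminfo -md`.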
Where is the Performance Metrics Application Programming Interface (PMAPI) documented?
The PMAPI defines the interface between a client application requesting performance data and the collection infrastructure that delivers the performance data. There are "man" pages for every routine defined in the PMAPI. Start with "man 3 pmapi" for an overview.
See also Chapter 3 of the Performance Co-Pilot Programmer's Guide.
Which application development languages are supported?
Most agents and clients are written in C. Some clients are C++. There has been some experimentation with Perl, but C is the common choice.

What is the nature of the communication between processes?
The TCP/IP communication between PMCD and a monitoring client is connection-oriented for the most part. When a connection is lost, the client library automatically attempts to reconnect to PMCD at a controlled maximum rate (it uses a variant of exponential back-off). The error-handling regime for the clients already supports "no data currently available" for lots of reasons (like a PMDA is not installed, or PMCD was restarted, or the connection to PMCD was lost), so there is typically very little that the client developer needs to do to handle this gracefully.

For monitor clients, once the initial metadata exchanges with PMCD are complete, there is typically one message to PMCD and one message back from PMCD for each sample, independent of the number of metrics requested and the number of instances (or values) to be returned. pmlogger is a monitor client, so the same applies to communication between PMCD and pmlogger.

At PMCD, each message from a monitor client is forwarded to one or more PMDAs; PMCD then collates the messages back from each PMDA that was asked to help and returns a single message to the client. It is an important part of the design that:
The communication between PMCD and the PMDAs uses TCP/IP or pipes or direct procedure calls (for DSO PMDAs).

What is involved in fetching metrics from PMCD?
The following high-level description follows the interactions between a monitoring client and PMCD to fetch metrics periodically.
To see all of the gory details, turn on PDU tracing and run simple pminfo commands, e.g.
$ pminfo -D PDU kernel.all.cpu
$ pminfo -D PDU -fdT kernel.all.load
See also Metrics, names, instances and values, ... eh?

Data aggregation and averaging in a PMDA?
Mark D. Anderson <mda@discerning.com> asks: obviously a monitor can compute anything it likes, but can a monitor request that an agent do some server-side computation before sending the resulting data back, either across measurements (say, changing units or adding together), or across time (running average, etc.)?

This is certainly possible, but we've tended to discourage it. Philosophically we believe any interval-based aggregation belongs in the monitoring clients. The PMDA cannot see the client state, so it does not know which client it is responding to at any given moment; you would need to add some additional state, using the pmStore(3) interface to selectively modify state in the PMDA from a client (this is typically used to toggle debug flags or enable optional instrumentation, and changing units would be in this category).

Can a monitor ask for qualitative events (e.g. threshold passing), instead of regular samples?
Not directly. Use the Performance Metrics API (PMAPI) directly for periodic sampling (most of the PCP monitoring tools are like this). Use pmie for filtering and events. See also Synchronous versus asynchronous notification?

How are triggers and alarms integrated to provide external notification?
External notification usually means some combination of e-mail, paging, phone-home or posting to an event clearinghouse. pmie is the PCP tool for automated monitoring and taking predicated actions. pmie's actions are arbitrary; there are some canned ones, and there is also a general "execute this command" action. The latter has been used to send pager events and to integrate events into larger system management frameworks like OpenView, UniCenter TNG, Enlighten DSM and ESP (from SGI).
Synchronous versus asynchronous notification?
The model for shipping values of the performance metrics from PMCD to the monitoring clients is "synchronous pull", where the clients explicitly ask for data when they want it. There is no push, broadcast, callback or other asynchronous notification for the values of performance metrics, although pmie can be used to perform periodic sampling and raise asynchronous alarms (of any flavour) when something interesting happens. For more details refer to the Performance Co-Pilot Programmer's Guide.
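The combination of synchronous pull and rule-driven alarms can be sketched in a few lines. This is a conceptual illustration only, not how pmie is implemented or configured: `poll`, the load values and the threshold of 2.0 are all hypothetical, and a real client would wait for its sampling interval between fetches.

```python
def poll(fetch, predicate, action, samples):
    """Pull `samples` values one at a time; run `action` whenever the rule holds."""
    fired = []
    for index in range(samples):
        value = fetch()            # the client asks; nothing is pushed to it
        if predicate(value):
            action(index, value)   # e.g. send mail, page someone, run a command
            fired.append(index)
        # A real sampler would sleep for the sampling interval here.
    return fired

# Hypothetical load-average readings standing in for values fetched from PMCD.
readings = iter([0.4, 0.7, 2.6, 0.9, 3.1])
alarms = []

fired = poll(
    fetch=lambda: next(readings),
    predicate=lambda load: load > 2.0,
    action=lambda index, load: alarms.append(f"sample {index}: load {load}"),
    samples=5,
)
print(alarms)
```

The data path stays strictly request/response; only the action taken by the sampling client looks asynchronous to whoever receives the alarm.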
Do you try to synchronize clocks?
No. The clients receive one timestamp from PMCD with each group of values returned, so the only issue is skew when a monitoring client is processing performance data from more than one host or more than one archive. This is not a real problem in most cases because PCP is aiming at system-level performance monitoring, with a bias for large systems, so sampling intervals are of the order of a few seconds up to tens of minutes. We are not trying to do event traces that require microsecond accuracy in the timestamps.

Is there an optimized mechanism for local monitoring?
Yes. Applications wishing to avoid the overhead of connection to PMCD and communication over TCP/IP may extract operating system performance data directly using the DSO implementation of the PMDA. The same application can decide at run-time to use either the regular or the express access path. See PM_CONTEXT_LOCAL in pmNewContext(3).

There is no client or server authentication, no encryption, etc. In practical terms, the data is not all that interesting and we have experienced very few issues in production sites over many years. A simple access control model is used, namely the PMCD daemon and the pmlogger processes support an IP-based allow/disallow mechanism for client connections on some or all network interfaces.

If you re-build PCP from the source RPM and use "make install" to do the installation (as opposed to an RPM-based installation), some manual post-installation steps will be required. In particular the "PMNS appears to be empty!" message from any PCP monitoring tool means the Performance Metrics Name Space (PMNS) has not been correctly set up. To fix this,
# touch /var/pcp/pmns/.NeedRebuild
# /etc/rc.d/init.d/pcp start
else if you are not starting pmcd this way, the brute-force method is,
# cd /var/pcp/pmns
# ./Rebuild -du

Resource utilization greater than 100%?
Mail received from Nicholas Guillier nicolas.guillier@airbus.com on Wed, 30 Jun 2004.
I use PCP-2.2.2-132 to remotely monitor a linux system. I sometimes face a strange problem: between two acquisitions, the consumed cpu time is higher than the real time! Once turned into a percentage, the resulting value can reach up to 250% of cpu load! This case occurs for kernel.cpu.* metrics and with disk.all.avactive metric as well (both from linux pmda).

First, cpu time and disk active time are both really counters in units of time in the kernel, so the reported value for the metric v requires observations at times t1 and t2, then reporting the rate (actually time/time, so a utilization) as
(v(t2) - v(t1)) / (t2 - t1)

The sort of perturbation you report occurs when the collector system (PMCD and PMDAs) is heavily loaded. The collection architecture assigns one timestamp per fetch, and if the collection system is heavily loaded then there is some (non-trivial in the extreme case) time window between when the first value in the fetch is retrieved from the kernel and when the last is retrieved from the kernel. Let me try to explain with an example with two counter metrics, x and y, with correct values as shown below
Now on a lightly loaded system, if we consider 3 samples at t=1, t=4 and t=7, and [x] is the timestamp associated with the returned values:
And the reported rates would be correct, namely
Now on a heavily loaded system this could happen ...
And the reported rates would be ...
So, the delayed fetch at time 4 (which does not return values until time 5) produces:
You're noticing the second case. Note that because these are counters, the effects are self-cancelling and diminish over longer sampling intervals. There is nothing inherently wrong here.
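The effect can be reproduced with a small simulation. This is a sketch under stated assumptions rather than the original worked example: the counter is hypothetical and accumulates one second of busy time per second of wall-clock time (a true utilization of 100%), samples are stamped at t=1, 4 and 7, and in the loaded case the value stamped t=4 is actually read from the kernel at t=5.

```python
def counter(t):
    """Hypothetical busy-time counter: 1 second of busy time per second (100%)."""
    return 1.0 * t

def rates(stamps, read_times):
    """Apply (v(t2) - v(t1)) / (t2 - t1) using the fetch timestamps.

    stamps     -- timestamp attached to each fetch
    read_times -- when the counter value was actually read
    """
    values = [counter(t) for t in read_times]
    return [
        (values[i] - values[i - 1]) / (stamps[i] - stamps[i - 1])
        for i in range(1, len(stamps))
    ]

stamps = [1, 4, 7]

light = rates(stamps, read_times=[1, 4, 7])   # values read when stamped
heavy = rates(stamps, read_times=[1, 5, 7])   # middle value read one second late

print("lightly loaded:", [f"{r:.0%}" for r in light])
print("heavily loaded:", [f"{r:.0%}" for r in heavy])
print("mean over both intervals:", f"{sum(heavy) / len(heavy):.0%}")
```

In the loaded case the first interval reports about 133% and the second about 67%, while the mean over both intervals is still 100%: the late read borrows counter increments from the following interval, which is why the error cancels out over longer sampling intervals.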