Don’t you hate it when you’re given the choice between a few tools, and all of them are real nifty, but are missing a (different) key feature?
Today, I had that tantalizing problem with bug tracking systems. As the new maintainer of DBD::Oracle, I’m trying to wrap my head around the 32 tickets currently open against it. Many tickets are platform-dependent, or appear for a specific version of Oracle. It would be fantastic to be able to tag the different tickets with those details, but alas CPAN’s RT doesn’t have tags. Booh!
But… GitHub Issues have tags. Wonderful. Except that… as of now, GitHub Issues doesn’t support priorities. Hiss!
So it all boils down to what do I want most (or, on the flip side, what I can live without), priorities or tags? Another factor, entropy, will probably make me keep RT as the official bug tracking system. But, migrating or not, I would still like to have a copy of my RT tickets on GitHub, just so that I can use the tags, and integrate the tickets more tightly with the development flow.
Fortunately, with the help of RT::Client::REST and Net::GitHub, it’s quite easy to export our tickets.
First, we create our github and rt clients:
use RT::Client::REST; use RT::Client::REST::Ticket; use Net::GitHub::V2::Issues; my $github_user = 'yanick'; my $github_token = 'deadbeef'; my $github_repo_owner = 'yanick'; my $github_repo = 'DBD-Oracle'; my $rt_user = 'pythian'; my $rt_password = 'secret'; my $rt_dist = 'DBD-Oracle'; my $gh = Net::GitHub::V2::Issues->new( owner => $github_repo_owner, repo => $github_repo, login => $github_user, token => $github_token, ); my $rt = RT::Client::REST->new( server => 'https://rt.cpan.org/' ); $rt->login( username => $rt_user, password => $rt_password );We don’t want to migrate tickets over and over again, so we look at our github tickets, and see if they relate to rt tickets. To keep things simple, I’ll make sure that tickets coming from RT contain the ticker number in their title.
# see which tickets we already have on the github side my @gh_issues = map { /\(rt(\d+)\)/ } map { $_->{title} } @{ $gh->list('open') || [] }; say join ' ', 'github issues:', @gh_issues;After that, we query RT for all the active tickets, and try to export them:
export_ticket( $_ ) for $rt->search( type => 'ticket', query => qq{ Queue = '$rt_dist' and ( Status = 'new' or Status = 'open' ) }, );The export function is pretty simple. As mentioned before, we first check that the ticket doesn’t already exist on the github side:
sub export_ticket { my $id = shift; say "ticket $id"; return say "already on github" if $id ~~ @gh_issues;If it doesn’t, we import the main ticket information, plus the description of the ticket (contained in its first transaction):
# get the information from RT my $ticket = RT::Client::REST::Ticket->new( rt => $rt, id => $id, ); $ticket->retrieve; # we just want the first transaction, which # has the original ticket description my $desc = $ticket->transactions->get_iterator->()->content;Finally, we create the github issue with those pieces of data:
# create the github ticket my $gh_ticket = $gh->open( $ticket->subject . ' (rt' . $_ . ')', "https://rt.cpan.org/Ticket/Display.html?id=$_\n\n" . $desc ); # and assign it the label 'rt' $gh->add_label( $gh_ticket->{number} => 'rt', ); }The full script is available here.
And that’s all there is to it. It’s simple, but it gets the job done.
In this post:
Tom Kyte, Cary Millsap, Steven Feuerstein hold a Database Guru Panel at #Kscope 11′s conference organized by ODTUG. I had a great opportunity to not just watch the Live Stream (you can see th recorded version HERE) but also ASK questions ;). Big kudos to organizers for the fantastic opportunity to ask questions via Twitter and Face Book live. I wasn’t sure if it was going to work . Surprisingly enough several of my questions had been asked and answered live. One of the questions was related to UNDO/REDO statistics. I think that it was a bit of challenge to define the question precisely having just 140 Twitter characters and therefore I decided to share my question here, give a bit of background and ask you “What is your opinion?”
Question and backgroundThe following is the question I asked in the twitter (16 min 14sec in the recording ).
Tweet: #kscope Q to Guru Panel: What is the best way to find TOP SQLs generating the most UNDO/REDO in 11GR2?
Tom Kyte response: My first response to that would be “Why?”
Well … This questions is coming from my day to day Oracle DBA duties. On several occasions I faced situations when an Oracle database (Instance) all over the sudden started to generate significantly more REDO data then usually. This quickly may “eat” all space allocated to archive log destinations and generate many other problems quite quickly. In such cases an Oracle DBA uses any diagnostics options available to find the root cause of the problem and resolve it in as short time as possible. My question is what Oracle DB kernel instrumentation do we have in place to troubleshoot such cases?
NOTE: It might be that word “SQLs” in my original question text introduced a bit of confusion. I really should have said DML operations instead of SQL. However I think many of us use word SQL for DML operations inclusively (including Oracle in V$SQL views ;).
My opinionIn my opinion we are missing a bit of instrumentation here. V$SQL, V$SQLAREA and other Oracle view are focused on Reads related statistics. This is understandable as most of operations a database is doing are data retrieval and Oracle did a great job here. However for the use case I described it would be very useful to have a set of additional columns in these views that would provide us with data changes related statistics.
What other options/instrumentation we have for finding the place in an application that suddenly started to generate a lot of data changes? Well there are several. However none of those provides us with a way to point the SQL (DML) that is a root cause of the problem. The options that I can think about are as following:
You would say that combining all methods named above we could narrow the scope and find the root cause of issue. And you are right. In no other options available we use any diagnostics to find the root cause. However I would say that if V$SQL… views would have UNDO/REDO statistics we could point to the SQL causing the issue in no time. I think there is a place for enhancement request.
What do you think? Have you been in such situations as I described? What options did you use to find the SQL responsible for the problem? Does Oracle need to add UNDO/REDO related statistics in database kernel instrumentation?
Please do not hesitate to leave your comment for this blog post.
Once again this week, you are holding the new edition of Log Buffer in your hands. From across the database planet, fresh and sizzling blog posts of Oracle, SQL Server and MySQL are presented just for you in Log Buffer #230. Enjoy and give your feedback in the comments.
Oracle:
Have you ever wondered about the V$OPEN_CURSOR view, the SESSION_CACHED_CURSORS parameter, and the two session-level statistics “session cursor cache count” and “session cursor cache hits”? Charles Hooper asks.
The Exadata IO Resource Manager (IORM) is a fundamental piece of the exadata database machine since the beginning. The function of the IORM is to guarantee a database and/or category (set via the database resource manager) gets the share of IO as defined in the resource plan at each storage server. Frits Hoogland has more.
Dough Burns shares some interesting tidbits.
Martin Wildake thinks that the Index Organized Tables(IOTs) are a much under-used and yet very useful feature of Oracle.
You open a ticket and wait… When you do get a reply it tells you to send information you’ve already posted, or suggests you try some workarounds you’ve already listed in the ticket as having not worked for you. Sounds familir? read on here.
SQL Server:
Kevin Kline is considering himself lucky and sharing something he is reading which he believes that the readers will enjoy too.
Replication is the SQL equivalent of Marmite as far as many people are concerned (you either love it or hate it, Richard Douglas has more.
Do you get error 435 in the Application event log and your SQLAgent fails to start? Ryan Adams asserts that the reason you’re getting this error is that the SQLAgent error log and working directory file locations are not available.
It’s been clear for a while now that Sharepoint is at the heart of Microsoft’s BI strategy. Chris Webb shows the signs.
Michael Swart is introducing a small series about ACID properties as it applies to databases.
MySQL:
When does InnoDB compress and decompress pages? Nav has the answer in this blog post.
Gerardo Narvaja has written a post about the capabilities and issues regarding MySQL SNMP agent.
Kentoku Shiba is pleased to announce the release of Spider storage engine version 2.26(beta) and Vertical Partitioning storage engine version 0.15(beta). Spider is a Storage Engine for database sharding. Vertical Partitioning is a Storage Engine for vertical partitioning for a table.
It is common practice in MySQL table design that fields are declared as NOT NULL but some non-sense DEFAULT values are specified for unknown field contents. In this article Shinguz show why this behavior is non optimal an why you should better declare a field to allow NULL values and use NULL values instead of some dummy values.
“List me all databases that have a current replication delay of more than 10 seconds.”
“Easy. Let’s fetch the data from Merlin.” And that is how it started.
In this post:
Disclaimer: Please note that the conclusions in this blog post are my own personal opinions.
Oracle Fusion Applications 11G is released and available
For all of you who haven’t yet noticed, Oracle Fusion Applications 11G is now available! You can get the installation from the Oracle E-Delivery site in no time. This is the moment many of us have been waiting for, if not for many years then for many months for sure. For me the media availability isn’t the most important thing – it is certainly a very important factor and I am planning to get my hands dirty with it very soon (stay tuned ;). Keeping in mind the amount of effort Oracle have invested in the product and the fact that the product is going to be a flagship ERP system on the market for at least next 10 years the most important thing is to get an access to the documentation. For me to understand the concepts and fundamental principals of the product is much more important than getting it installed ASAP ( this must be a sign of age , *sigh* ;)
My initial conclusions
I am at the very beginning of the long, long, long journey. Have just started to read “Fusion Applications Installation Guide” this week. A small warning here. All documentation available at the moment has a “Beta Draft” stamp. Therefore some things may change. However, the main principals are there and I am happy to share my first conclusions as I (as Oracle Apps DBA) draw from my reading:
Other thoughts
Just few other comments:
This is a follow up post to answer questions you asked during my “Oracle 11G SCAN: RAC SIG webcast” at June 16th.
You can get the full presentation HERE – “Oracle 11G SCAN: Concepts and implementation experience sharing”. If you are not a member please feel free to join the RAC SIG. The membership is free and it will take you just couple of minutes to get registered (hint: check your spam folder if you don’t see password email from RAC SIG).
Australia and Europe Webinar is scheduled (July 27th 2011)If you missed the “Oracle11G SCAN: Sharing successful implementation experience” Webinar then you have a chance to see it. For Australia – Europe time zones I am repeating it at July 27th.
If you are interested please feel free to register:
Australia - I would love you to go through Sydney Oracle Meetup site
For All others - please register HERE
Thanks for all your questions guys you asked during the presentation! Some of them deserve a whole presentation to make the answer complete. Let see if I will find enough motivation and time to follow up with a SCAN2 presentations. Meanwhile let me give you some short answers and possible hints. Below are ALL questions I recorded during the Webinar. I tried my best to provide the best answers I can give in limited time frame I had. If you have any further questions or related comment please be my guest and post a comment under this post.
Question from Rick Blythe:
Q: What happens during failure events ? – How scan connection still works when node dies? When local listener dies/when scan listener dies… ?
A: The main component that makes the solution agnostic to nodes’ failures are 3 SCAN listeners and related IP addresses. Those listeners are Cluster wide services meaning that any of 3 listeners are not bind to any particular node and can run on any of the nodes. If one of the nodes crashes than the following happens:
- Survived nodes will have at least 1 SCAN listener running
- Short time (a minute) after the crash SCAN Listener records got refreshed and new client request are not forwarded to the crashed node anymore
- Failed SCAN listeners got started on survived nodes
- After a minute (default refresh time) new SCAN listener’s instances know about the DB instances that serves certain services configured and start function as before.
I think during the crash itself there are a small timeframe when client connections may get some errors alike the following. However in 1-2 minutes it all gets to normal:
The exact SCAN information refresh process is to be investigated. I am guessing Oracle Process and Notification service might be part of the process as each of the listeners keeps connection to “127.0.0.1:6100″ socket.
Question from Anand Sathyamurthy:
Q: Yury great presentation, great thoughts. How to connections in ebsusinessuite in tns_admin
A: Thank you Anand for the great question. Since certain ATG autoconfig templates version Oracle officially supports SCAN. Please see the following note:
Using Oracle 11g Release 2 Real Application Clusters with Oracle E-Business Suite Release 12 [ID 823587.1]
If you for one reason or another are not on the right ATG level you may be interested to have a look on the following my blog post.
EBS DBA: Integrating EBS with Oracle 11G Cluster Infrastructure (TNS_ADMIN).
Questions from Pascal:
Hello Yury, thank you for a Great Presentation!
Q1: I have a Question regarding SCAN and Static Listeners. How to configure or integrate Static Listeners in a RAC Environment which is using SCAN? This is required for instance when using Data Guard. Do we have to configure the Static Listener SID_LIST Entries under the LISTENER_SCAN* entries in the listener.ora File also?
A: My first guess would be to make sure that your local “static” listener is listed in “local_listener” init.ora parameter. This way Oracle instance would automatically register the local listener to SCAN listeners. A local DB Instance users remote_listener parameter to contact SCAN listeners.
Q2: Another Question regarding EZConnect and Connect Time Failover Is it possible to use EZConnect Method with Load Balancing/Connect Time Failover when using SCAN?
A: I would say that Failover and Load Balancing options are build-in into SCAN concept. You do not need to specify those anymore. You just specify the SCAN connection string as described in my presentation and both features are enabled for your clients by default. For more information on EZConnect Method click HERE. See “Understanding Failover, Load Balancing, and SCAN” for additional information on what is going on the server side of things.
Q3: Is it supported to use the old style connect strings using VIPs in the tnsnames.ora File for connecting to RAC 11gR2?
REF: Understanding the Oracle Real Application Clusters Installed Configuration
“Clients configured to use VIP addresses for Oracle RAC releases prior to Oracle Database 11g release 2 can continue to use their existing connection addresses; using SCAN is not required.”
Question from Yong:
Q: Yury: Do regular and SCAN listeners both have to know node load stats? If yes, isn’t it duplicate?
A: REF Database Services for an Oracle RAC Database.
I think that you are right and the load stats (workload) got registered in both listeners. The only listener that is using the information provided in “proper” SCAN configuration is SCAN listener. However!!! (REF: previous answer) SCAN doesn’t limit the previous VIP address load balancing configuration on the client side. Oracle registers workload on both SCAN and VIP listeners using service registration process to make both load balancing options available.
Question from Babu Srinivasan:
Q: so if i stop and start .. will it interrrupt the new connections?
A: If you stop start a SCAN listener it will not interrupt existing or new connections. There may be a short time-frame (less than a minute) when SCAN listener got the services related information from local listeners (DB instances). During that time-frame your clients might see the following errors:
Question from Babu Srinivasan:
Q: endpoints listener.ora for clusters .. what should be the entries in them .. how it is managed?
A: REF: How Oracle Database Uses the Listener File (listener.ora)
In ideal case you do not edit the listener.ora file nor related local_listener parameter. The endpoints are the ADDRESS part of the listener.ora file. You can create an additional listeners using the following syntax or modify and add additional endpoints to your configuration using srvctl utility.
[sode]srvctl add listener -p TCP:1522 -o /u01/app/oracle/product/11.2.0/dbhome_1
srvctl modify listener -n mynode1 -p “TCP:1521,1522″
srvctl modify listener -l crmlistener -p TCP:1522[/code]
However sometimes in emergency cases you may want to implement a "dirty fix" and make the manual adjustments (I did it on couple of occasions). However after got the emergency under control I would strongly suggest you to research on the issue you faced and come with an action plan to convert the setup to the officially supported. Otherwise you may end up with a bigger problems (mess).
Question from Deepak:
Q: What is the problem using “Oracle” user instead of “Grid” – which I thought is really an optionall user during GI install?
A: Well. Strictly speaking there is no problem running both components under the same OS user. However starting from 11GR2 there are much cleaner boundaries in between Grid and Oracle components. If you isolate the Grid services under separate user you protect it from everything else. On the other hand if you have multiple RAC databases (and instances) running on the same host you may want to run those under a separate oracle users. Keeping Grid and Oracle separated makes sense to me after I worked with it on several occasions. However the solution that works in one case may not be suitable for others. If you run a small configuration or/and your DBA/Maintenance team is small (no need to separate responsibilities) you may find that running all under the same user is a better idea. In any circumstances I would keep Cluster and RDBMS oracle homes separate (I am guessing that it isn’t possible to keep those in the same oracle home starting from 11GR12).
Question from David Start:
Q: Have you run into any issues with clients on windows caching the first ip address that dns returns and not round robining
----------------------------------------------------------------
A: Yes I do remember some issues related to the fact that some Windows versions cached DNS entries. However I just tested it on my patched XP workstation and don’t see it anymore. I am guessing it is very OS specific. This is a very good point tho and WARNING for others. Please keep this in mind. Some OS in certain circumstances can cache DNS entries. This may prevent your and users from connection to Oracle services in case of one of the 3 SCAN failures.
Recently while working on a client site I discovered that it takes 30-90 seconds to make an ssh connection to one of the servers. Connections between servers for this client typically take < 1 second, so the lengthy connection time was definitely out of order.
If you are familiar with debugging ssh connections you are probably familiar with the ‘-v’ option that directs ssh to output verbose comments stating which operation is currently taking place. You can add up to three -v options on the command line, increasing the verbosity with each one. An example follows:
A normal ssh connection 16:20:jkstill-22 > ssh 192.168.1.132 Last login: Wed Jul 13 16:20:53 2011 from poirot.jks.com [ /home/jkstill ] 0:0-oravm01:vmdb01:jkstill-12 > Connect with a single -v command line options 16:21:jkstill-22 > ssh -v 192.168.1.132 OpenSSH_5.2p1, OpenSSL 0.9.7g 11 Apr 2005 debug1: Reading configuration data /usr/local/etc/ssh_config debug1: Connecting to 192.168.1.132 [192.168.1.132] port 22. debug1: Connection established. debug1: identity file /home/jkstill/.ssh/identity type -1 ... debug1: Authentication succeeded (publickey). debug1: channel 0: new [client-session] debug1: Entering interactive session. Last login: Wed Jul 13 16:21:00 2011 from poirot.jks.com Now connect with three -v command line options 16:22:jkstill-22 > ssh -v -v -v 192.168.1.132 OpenSSH_5.2p1, OpenSSL 0.9.7g 11 Apr 2005 debug1: Reading configuration data /usr/local/etc/ssh_config debug2: ssh_connect: needpriv 0 debug1: Connecting to 192.168.1.132 [192.168.1.132] port 22. debug1: Connection established. debug1: identity file /home/jkstill/.ssh/identity type -1 debug3: Not a RSA1 key file /home/jkstill/.ssh/id_rsa. debug2: key_type_from_name: unknown key type '-----BEGIN' debug3: key_read: missing keytypeSo, now I can connect to the remote server, observe the screen and see which operations of the connection are taking a long time.
There are however at least two problems with this:
If only ssh provided some method for showing the timing of each operation. Then I could create a log file of it, bundle it up into a zip file and send it to the system administrator with an explanation of the problem, and corroborating data show where ssh was having problems.
As this was a linux system, I could have just used strace. Strace is too granular for this purpose – it provided far more data than I wanted.
And like it or not, sometimes I am not working on my favorite OS. Sometimes I work on Windows, and it would be nice to have some method of obtaining timing data that would work on many platforms.
About this time I had an idea. They don’t occur all that often, so I was glad to have one to entertain.
The diagnostic output from most programs is usually unbuffered, that is, it is written directly to standard output or error output as soon as it occurs. If that is the case, then the time between each line of output is the amount of time the operation required to run.
Ah hah! Why don’t I just collect the time between each line, and report the difference?
Doing so would allow me to see which parts of ssh are taking so much time.
Being something of a Perl aficionado I set about creating a utility to accept text into standard output, and report the time difference separating the arrival of each line.
While it did not turn in to a daunting task, if you’re at all familiar with Oracle 10046 trace files you will have some knowledge of the difficulty of timing events. It is not always straight forward. For instance, sometimes the timing for the event appears before the event itself appears in the output.
To keep this simple, useful and maintainable, the following goals were set for this new utility:
Thus perltrace.pl was born (known as perltrace in the rest of this article)
Perltrace is meant to be used in a pipeline. Rather than trace an ssh operation, I used perltrace to get timing data for an scp operation. The reason for doing so is that the ssh eventually succeeds so that I am connected to a new server. As I am logging the connection details it made more sense to use scp, and it must do most everything that ssh does, that is, verify authentication and make a connection to the remote host.
Here’s the command line I used, along with some initial lines from the output:
scp -v -v somefile somuser@someserver:~/ | perltrace.pl | tee scp_trace.log 0.000027 Executing: program /usr/bin/ssh host someserver, user (unspecified), command scp -v -t ~/ 0.000005 OpenSSH_4.3p2, OpenSSL 0.9.8b 04 May 2006 0.000003 debug1: Reading configuration data /etc/ssh/ssh_config 0.000003 debug1: Applying options for * 0.051990 debug2: ssh_connect: needpriv 0 0.003796 debug1: Connecting to someserver [192.168.1.80] port 22. 0.000146 debug1: Connection established. 0.000035 debug1: identity file /home/username/.ssh/identity type -1 0.000024 debug3: Not a RSA1 key file /home/username/.ssh/id_rsa. ...Now with a log file in hand that included timing data, it was quite simple to isolate the ssh components that were consuming the most time:
16:22:jkstill-22 > sort -n scp.log | tail -10 0.009709 debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY 0.010983 debug1: loaded 3 keys 0.043485 debug2: channel 0: rcvd adjust 131072 0.048267 debug1: SSH2_MSG_SERVICE_REQUEST sent 0.048332 debug2: channel 0: written 16 to efd 6 0.048438 debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP 0.051990 debug2: ssh_connect: needpriv 0 10.004512 debug2: we sent a publickey packet, wait for reply 20.026335 debug1: read PEM private key done: type RSA 60.053010 debug1: Entering interactive session..I’ve done my bit – this information can be (and was) sent off to the system administrator to help troubleshoot the issue.
So, how does this all work? We can walk through the relevant bits of the code, and the scripts will be available for download.
First of all, here’s the help information
jkstill-6 > perltrace.pl -help perltrace.pl usage: perltrace.pl - get timing information from program output perltrace.pl -cmdfirst|nocmdfirst -wallclock -totals -cmdfirst output precedes execution -nocmdfirst output follows exection -wallclock include wall clock time -totals include start,stop and total elapsed time eg: perltrace.pl -option1 parameter1 -option2 parameter2 ... time scp -v -v somefile somuser@someserver:~/ 2>&1 | perltrace.pl -cmdfirst > t.logSo that the perltrace can be tested, there is a shell script perltrace_ut_data.sh. The shell scripts performs a series of sleeps in a loop, optionally printing the command before or after it has occurred. The entire script appears here:
perltrace_ut_data.sh #!/bin/sh # default is show cmd first function usage { echo $(basename $0) echo "-c show command then execute" echo "-n execute then show command" } while getopts cn arg do case $arg in c) ORDER='CMDFIRST';; n) ORDER='NOCMDFIRST';; *) echo "invalid argument specified"; usage;exit 1; esac done # default to CMDFIRST : ${ORDER:='CMDFIRST'} echo "using $ORDER" >&2 for sleeptime in .5 .75 .25 .1 .9 1.2 .1 .2 .3 do if [ "$ORDER" == 'CMDFIRST' ]; then echo sleeping for $sleeptime seconds sleep $sleeptime else sleep $sleeptime echo sleeping for $sleeptime seconds fi doneHere is an example of using perltrace_ut_data.sh to test and verify perltrace.pl:
jkstill-6 > ./perltrace_ut_data.sh -c | perltrace.pl -cmdfirst -wallclock -totals using CMDFIRST 2011-07-13 21:17:29 0.465963 sleeping for .5 seconds 2011-07-13 21:17:29 0.751859 sleeping for .75 seconds 2011-07-13 21:17:30 0.252734 sleeping for .25 seconds 2011-07-13 21:17:30 0.102787 sleeping for .1 seconds 2011-07-13 21:17:30 0.902740 sleeping for .9 seconds 2011-07-13 21:17:31 1.201655 sleeping for 1.2 seconds 2011-07-13 21:17:33 0.102713 sleeping for .1 seconds 2011-07-13 21:17:33 0.201860 sleeping for .2 seconds 2011-07-13 21:17:33 0.302074 sleeping for .3 seconds Start Time: 2011-07-13 21:17:29 End Time: 2011-07-13 21:17:33 Elapsed: 4.285667 The guts of perltraceHere I have included the relevant bits of perltrace.pl, stripped of debugging code and other extraneous bits that get in the way of understanding how it works.
First, the code, and then some explanations.
my @printline; if ($cmdfirst){ $printline[1]='' } my $firstline = 1; while (<>) { chomp; $line=$_; shift @printline, push @printline,$line; # if this is the first line then we do not yet have timing information if ($firstline) { $firstline = 0; # pick up first line timing if -nocmdfirst unless ($cmdfirst) { ($endTime) = [gettimeofday]; $tdiff = tv_interval($startTime, $endTime); printf "%s%$secondsFormat %s\n", ($wallClock ? $wallClockTime . ' ' : '') , $tdiff,$printline[0]; ($startTime) = [gettimeofday]; $wallClockTime = strftime "%Y-%m-%d %H:%M:%S", localtime; } next; } ($endTime) = [gettimeofday]; $tdiff = tv_interval($startTime, $endTime); printf "%s%$secondsFormat %s\n", ($wallClock ? $wallClockTime . ' ' : '') , $tdiff,$printline[0]; ($startTime) = [gettimeofday]; $wallClockTime = strftime "%Y-%m-%d %H:%M:%S", localtime; } # pick up the last line if cmdfirst # if no cmdfirst we already got timing info inside the loop if ($cmdfirst) { shift @printline, push @printline,$line; $endTime=[gettimeofday]; $tdiff = tv_interval($startTime, $endTime); printf "%s%$secondsFormat %s\n", ($wallClock ? $wallClockTime . ' ' : '') , $tdiff,$printline[0]; } Line 4This line probably doesn’t look very important, but in fact it is the key to printing the timing information with the correct line from the input.
There are two different types of input we expect to see. As specified by the perltrace.pl options they are -cmdfirst and -nocmdfirst, with -cmdfirst being the default.
If we want to associated the time with the correct operation, it is necessary to know whether the diagnostic output appears before the operation, or after the operations.
So line 4 is used to ensure there are 2 elements in the printline array. If the -nocmdfirst option is used, there will initially be no elements in the printline array. This will make more sense as we go to line 14.
Line 14Here the topmost element of the array is shifted out. If there are no elements then nothing is shifted out, and the array remains empty. So if the -cmdfirst option was used, the current line of input will be placed in $printline[1]. If -nocmdfirst is in force, then the current line of input will be in $printline[0].
Lines 17-31These lines deal with handling the first line of input from the pipe, and as such this block is executed only once. If -nocmdfirst is enforced, we pick up the timing information and go to the top of the loop. If however -cmdfirst is enforced, we skip the timing information. Why skip it? Since we expect the command to appear before it is executed, we just go back to the top of the loop and wait for the next line of input. When that line appears, we know the previous operation has completed and we can print the command and the timing operation.
Lines 33-40These lines are executed for every remaining line of input. Line 33 gets the current time, and line 35 computes the elapsed time for the current operation. Line 37 is quite interesting, as this is the line that prints the output with the timing information. If -wallclock was included as a command line option, the wall clock time will be at the beginning of the line. Next will be the elapsed time, followed by the input text from the pipe. You may have noticed that all the printf() statements always use $printline[0] as the text to print. You may recall that the @printline array may have 1 or 2 elements in it, dependent on whether -nocmdfirst or -cmdfirst was used on the command line. Line 4 ensures that the correct timing information is matched up with the input line by setting the array size.
Lines 46-51And finally these lines complete the job if -cmdfirst was used, as the final timing data is not known until the last line of input has been received and the program has exited the loop.
Other UsesSince first writing perltrace.pl I have found other areas to use it. It has recently proved useful for timing parts of RMAN and rsync operations.
If you have are running a lengthy RMAN job, pipe the output through perltrace and then to a log file. You can see in real time how much time each step requires. Here’s the output from an operation to recatalog some backup files that were moved. Even if you don’t need the information right away, it is always nice to have an idea of how long it takes to perform various database operations. I had no idea it would take so long to recatalog these backup pieces, as this was something I had never before done.
$ORACLE_HOME/bin/rman target / catalog rman/PASSWORD@RMANCAT cmdfile recatalog_PRODRAC.rman 2>&1 | ~/bin/perltrace.pl -wallclock | tee RMAN_recatalog.log 2011-07-13 10:27:34 0.000084 2011-07-13 10:27:34 0.000016 Recovery Manager: Release 10.2.0.4.0 - Production on Wed Jul 13 10:27:34 2011 2011-07-13 10:27:34 0.000012 2011-07-13 10:27:34 0.000011 Copyright (c) 1982, 2007, Oracle. All rights reserved. 2011-07-13 10:27:34 0.791405 2011-07-13 10:27:35 0.027189 connected to target database: PRODRAC (DBID=626198001) 2011-07-13 10:27:35 0.222686 connected to recovery catalog database 2011-07-13 10:27:35 0.012622 2011-07-13 10:27:36 0.010598 RMAN> catalog backuppiece '/mnt/archive/oracle/backups/archive2/PRODRAC2/PRODRAC_3981_1_sdmh3352_1_1_20110710.arc'; 2011-07-13 10:27:36 0.010762 2> catalog backuppiece '/mnt/archive/oracle/backups/archive2/PRODRAC2/PRODRAC_3982_1_semh33ne_1_1_20110710.arc'; 2011-07-13 10:27:36 0.011331 3> catalog backuppiece '/mnt/archive/oracle/backups/archive2/PRODRAC2/PRODRAC_3983_1_sfmh34ct_1_1_20110710.arc'; ... 2011-07-13 10:35:09 0.000901 cataloged backuppiece 2011-07-13 10:35:09 21.824507 backup piece handle=/mnt/archive/oracle/backups/archive2/PRODRAC2/PRODRAC_3985_1_shmh358r_1_1_20110710.arc recid=11515 stamp=756383709 2011-07-13 10:35:31 21.281820Even more recently I was using rsync to relocate a few hundred megabytes of archived logs to a new location. By piping the output through perltrace I could see how time was required to copy each file and then make a reasonable estimate as to how much more time was required to complete the job. This was useful in forming an estimate of how much time it would take to copy all the files.
oracle> rsync -av --update /mnt/rac/oracle/archive/PRODRAC /mnt/archive/oracle/archive |~/bin/perltrace.pl 0.897609 building file list ... done 0.848139 PRODRAC/ 8.644107 PRODRAC/1_1610_755849488.dbf 10.221037 PRODRAC/1_1611_755849488.dbf 20.903755 PRODRAC/2_1401_755849488.dbf 22.600272 PRODRAC/2_1402_755849488.dbf 21.960984 PRODRAC/2_1403_755849488.dbf 22.321083 PRODRAC/2_1404_755849488.dbf 22.843627 PRODRAC/2_1405_755849488.dbf 20.304977 PRODRAC/2_1406_755849488.dbf ...If you have ever used tail -f to watch the oracle alert log for changes such as the application of archive logs during a database recovery, you may have wanted to know just how much time was required to apply each log. Oracle does provide date stamps before and after the log, but you are left calculating the time yourself. This can easily accomplished by piping the output of tail -f through perltrace. I don’t have any results handy for this one, but the command line is simple enough:
$> tail -f alert_orcl.log | perltrace -wallclockThe first few lines will of course have incorrect timing information, as those lines already exist in the file. This method works for any log file that is continuously written to.
Download the Perltrace Scripts
最近评论
30 周 3 天 前
31 周 1 天 前
32 周 2 天 前
32 周 3 天 前
32 周 5 天 前
33 周 4 天 前
33 周 5 天 前