This is a very interesting explanation of zmailer processes, by the
current zmailer maintainer, in reply to a thread about zmailer speed.

david potter

---------- Forwarded message ----------
Date: Wed, 30 Jun 1999 01:21:16 +0300 (EET DST)
From: Matti Aarnio <mea@nic.funet.fi>
To: Dani Pardo <dani@minerva.enpl.es>
Cc: zmailer@nic.funet.fi
Subject: Re: Newbie: Slow scheduling?

>   Mm.. it's true.. :) I've tried this perl script:
>
> #!/usr/bin/perl -w
>
> for($counter=0;$counter<500;$counter++)
> {
>  open MAIL, "|mail dani";
>  print MAIL "Hi!";
>  close MAIL
> }
>
>   If I run it under the Ultra-1 and sendmail, it never ends.. because it
> keeps delivering them one by one and doing all the job.
>   From the Pentium Pro running ZMailer, I can set counter to 5000, and
> have all them delivered in 2 or 3 minutes.

Let's see what happens under ZMailer:

- Pipe runs a shell which finds 'mail', which runs '/usr/lib/sendmail'.
  (The sub-shell does an iterative search through the PATH environment,
   as it can't use any cache -- there is no cache..)
- Mail collects input into a temporary file, and gets EOF.
- Mail runs 'sendmail', and feeds the temporary file to it, and in the
  end just closes the feed pipe, plus removes that temporary file.
  (two directory related creation/deletion transactions for 'mail')
  Finally 'mail' terminates.
- ZMailer's 'sendmail' creates a temporary spool file in
  $POSTOFFICE/public/ where the message collection happens; once
  'sendmail' gets an EOF, it closes that file, and renames it into the
  $POSTOFFICE/router/ directory.  Actually mail_open() does file
  creation + one rename before returning the FILE* handle to the file.
  (three dirops)
  The 'sendmail' program terminates.
- Router acquires a lock on the spool file in $POSTOFFICE/router/
  by means of doing rename().  (one dirop)
- Router creates a temp file for transport specs in that directory.
  (one dirop)
- Router closes the created TA spec file, and renames the spool file
  and the fresh TA-spec file to the $POSTOFFICE/queue/ and
  $POSTOFFICE/transport/ directories respectively.  (two dirops)
- Scheduler picks up the new TA-spec file, and moves it into hashed
  subdirs (and the spool file likewise).  (two dirops)
  ("-H" option in use)
- Delivery is run; because each job is so much like the previous one,
  delivery very likely happens by serializing access to the user's
  mailbox, so that multiple transport agents don't need to do a
  ``thundering herd'' on the fcntl() (or whatever) lock on the
  mailbox file.
- Delivery completes, scheduler deletes the spool file and the TA spec
  from their respective directories.  (two dirops)

So, for each delivery, ZMailer does at least 13 directory MODIFYING
operations; moving a file from one directory to another is assumed here
to be as costly as a simple rename within one directory, or a file
creation or unlink.  (The minimum count for dirops is 8, if a smartly
written spool-in program is used, and "-H" isn't used for the scheduler.
With considerable programming one more can be taken away from the router
lock acquisition by changing the locking methodology for parallel
routers.)

Now I do tend to think that those directory transactions are the most
expensive part of email delivery in the usual UNIX environment.
In this sense 'sendmail' does a pretty lightweight set of dirops: a mere
6, I think (creating and deleting 3 files in the 'mqueue' directory.)

If the directory data is kept in synchronous writing state, then the
speed of doing dirops depends directly on the speed of disk seek and
physical write.  (For reads the data is rather hot in the buffer cache..)

That said, 5000 messages in 3*60 seconds means about 28 deliveries per
second, or about 360 dirops per second!
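The tally and the throughput figure above can be re-derived with a few lines of Python. This is only an arithmetic sketch: the per-stage counts are taken from the list in the message, while the stage labels and function names are my own shorthand, not anything from ZMailer.

```python
# Directory-modifying operations (dirops) per message, as counted in the
# message above. Stage labels are illustrative shorthand only.
DIROPS_PER_STAGE = {
    "mail: create + delete temporary file": 2,
    "sendmail: mail_open() create + rename, then rename into router/": 3,
    "router: rename() to lock the spool file": 1,
    "router: create TA-spec temp file": 1,
    "router: rename spool file and TA spec": 2,
    "scheduler: move both into hashed subdirs (-H)": 2,
    "scheduler: delete spool file and TA spec": 2,
}


def dirops_per_message(stages=DIROPS_PER_STAGE):
    """Sum the per-stage dirop counts for one delivered message."""
    return sum(stages.values())


def dirops_per_second(messages, seconds, per_message):
    """Dirop rate implied by delivering `messages` in `seconds`."""
    return messages / seconds * per_message


total = dirops_per_message()
rate = dirops_per_second(5000, 3 * 60, total)
print(total, round(5000 / (3 * 60), 1), round(rate))  # prints: 13 27.8 361
```

So the "about 28 deliveries" and "about 360 dirops" per second are roundings of 27.8 and 361.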
You must have a very recent Linux on that PPro of yours, one which has
asynchronous metadata..  (Doing the same in 2*60 seconds would mean about
540 dirops per second; a mere 2 milliseconds per dirop ... ok, just barely
possible with synchronously written metadata, if the system doesn't need
to actually write/read file data content (all in cache), and your disks
are brand new speed demons, or you operate on a RAMDISK..)

>   So for each message, the delay is higher in zmailer, but the speed (in
> messages per second) under ZMailer is *much* faster.

The term you are looking for is:  latency

Overall system throughput is quite impressive, but it can still have some
inherent latency built in, due to periodic collection of new jobs into the
scheduler.

>   Thanks for the info, It has been very helpful.
>   Keep on the good job,
> ---
>  Dani Pardo, dani@enpl.es
>  Enplater S.A
>
> PS: I think I'll forgive all that R$*<@$%y>$* $#ether $@$2 $:$1<@$2>$3 :-)

That is sendmailism, we have other mechanisms.

/Matti Aarnio <mea@nic.funet.fi>
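The message mentions that the router locks a spool file by doing a rename(). A minimal Python sketch of that general technique follows. It is not ZMailer's code: the function name, the `.claimed-by-N` suffix and the file name "12345" are hypothetical, chosen only to show why one rename (one dirop) is enough to act as a lock.

```python
import os
import tempfile


def try_claim(directory, name, worker_id):
    """Try to claim `name` by renaming it; return the new path or None.

    rename() is atomic within one filesystem, so when several workers
    race for the same file, one rename succeeds and the others find
    the source name already gone.
    """
    src = os.path.join(directory, name)
    # Hypothetical naming scheme, for illustration only.
    dst = os.path.join(directory, f"{name}.claimed-by-{worker_id}")
    try:
        os.rename(src, dst)
    except FileNotFoundError:
        return None  # another worker claimed it first
    return dst


with tempfile.TemporaryDirectory() as router_dir:
    # Stand-in for a freshly submitted spool file.
    open(os.path.join(router_dir, "12345"), "w").close()
    first = try_claim(router_dir, "12345", worker_id=1)
    second = try_claim(router_dir, "12345", worker_id=2)
    print(first is not None, second is None)  # prints: True True
```

The second caller loses the race without any separate lock file, which is the appeal of the approach; the cost is the extra directory modification that the message counts as one dirop.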