bergie: BTW, the client is asking when we expect to get n.n.reg working
[10:55] bergie: Can you figure out *any* workaround?
[10:55] torben: you won't like this: i'm really out of ideas right now
[10:56] torben: this segfault has so much randomness, that i even can't track it down to specific constellation
[10:56] bergie: what if I just surround MidCOM with one mgd_auth_midgard("admin") call?
[10:56] torben: just 5 minutes ago, i registered for an event with a new account creation on the cached m-t host without the slightest problem, now i'm getting segfaults again
[10:56] bergie: would n.n.reg then skip its own auth calls?
[10:57] torben: ?
[10:57] torben: i don't think that the auth calls are the root of this
[10:57] torben: all these auth calls work
[10:57] torben: the segfaults are at the end of the request
[10:57] torben: midcom completely runs through, 100%. db-changes, everything
[10:57] torben: look at this:
[10:57] torben: Sep 01 09:56:45 [debug] midcom_helper__cache: Sent HEader: ETag: d41d8cd98f00b204e9800998ecf8427e
[10:57] torben: Sep 01 09:56:45 [debug] midcom_helper__cache: We are on no_cache, flushing output buffer and exitting
[10:57] torben: Sep 01 09:56:45 [debug] midcom_helper__cache: END OF MIDCOM REQUEST
[10:57] bergie: yeah, but in all cases of segfaults in Midgard that I've seen, mgd_auth_midgard() has been the root cause
[10:57] torben: [Wed Sep 1 09:56:45 2004] [notice] child pid 17673 exit signal Segmentation fault (11)
[10:58] torben: if you take mgd_auth calls out of this component, it won't work either
[10:58] bergie: how many times do you mgd_auth_midgard() or mgd_unsetuid() in those page views?
[10:58] torben: at most twice as far as i see it.
[10:59] torben: once for authenticating the user that might be logged in, a second time for doing the actual db changes
[10:59] bergie: that can be the problem.
OpenPSA has a Midgard segfault, too
[10:59] bergie: or at least had
[10:59] torben: i can't disable these auth's
[10:59] bergie: When we update an OpenPSA user record Midgard crashes quite soon. So the page just calls flush(); exit(); after the update
[11:00] torben: well, let me put it that way
[11:00] torben: if i disable all auth calls in the component, you have to ensure, that the component on-site runs with the privileges it requires to create persons and events in the system
[11:00] torben: which means sg-admin privileges essentially
[11:01] torben: i just can't exit in the midst of the request, your customer does want a user interface, does he?
[11:02] bergie: obviously. but you could try making a redirect and quitting
[11:02] torben: *phone*
[11:02] Piotras joined the chat room.
[11:02] Piotras: hi all
[11:03] bergie: hi Piotras
[11:03] Piotras: bergie: there is old copyright info for midgard php module
[11:03] bergie: Piotras: I just remembered that OpenPSA has this same crash
[11:03] bergie: Piotras: also when we edit an user
[11:04] torben: re
[11:04] torben: is just writing a mail to dev
[11:04] bergie: 36: mgd_auth_midgard( $system_user, $system_pass, 0 );
[11:04] bergie: 37: $midgard = mgd_get_midgard();
[11:04] bergie: 50: if (is_object($person)) {
[11:04] bergie: 51: $res = $person->update();
[11:04] Piotras: bergie: should I point © 2004 to midgard community?
[11:04] bergie: yep
[11:05] bergie: and in the snippet that called this snippet, we just have flush();
[11:05] Piotras: ok , I will do , by now I have mess in sources
[11:06] Piotras: bergie: I am not sure if what I wrote is 100% true , but what I found with google is exactly the same as midgard segfaults
[11:07] Piotras: I am afraid it is "unsolutionable"
[11:07] bergie: arg
[11:07] bergie: so what can we do?
[11:08] torben: bergie: i'm just checking this flush() / exit() combo
[11:09] torben: bergie: that _might_ be a workaround here
[11:09] torben: bergie: i think that's it
[11:09] torben: what happens now is:
[11:09] bergie: torben: basically, on those page views where you update a person record, you should only output something very simple, like redirect and *not* make *any* Midgard queries after the update()
[11:09] torben: 1. midcom processes the request, with or without buffering should be irrelevant here
[11:10] torben: 2. midcom comes to its end, the cache is updated (or not, depending on settings)
[11:10] torben: up to this point we are in a state like this
[11:10] torben: output has been generated, and is somewhere in the "queue" between php and apache, but not yet sent to the client.
[11:11] torben: up to now, the system would simply exit and segfault there, with this code still in this "cached" state, meaning not sent to the client
[11:11] torben: if you now insert a flush() before the exit(), you get this:
[11:11] torben: 3. the flush call forces php/apache to put the component's output into the network up to the client. This is essentially a blocking call until the client has received and confirmed the data, as far as i understand flush().
[11:12] torben: So this might be a DOS condition in some cases where the client doesn't confirm received data
[11:12] torben: but
[11:12] torben: just this will give us an unique advantage
[11:12] torben: if PHP/Apache segfaults after this line, we are sure that all data that has been generated so far makes it to the client.
[11:13] Piotras: torben: look at my mail at dev
[11:13] torben: the proof i had just now, three requests, each with segfaults, where the client actually got the generated data, so that the client didn't even "see" that the server had much trouble
[11:13] Piotras: torben: about exit
[11:14] bergie: torben: ok, so flush(); solves this?
[11:14] torben: hm
[11:14] torben: i think so, yes
[11:14] torben: the trick now is, that i have to find all places where we exit()
[11:15] bergie: well, that at least buys us time to work on this segfault, as I can hopefully get the angry client off my back
[11:15] torben: i fear, that it doesn't work on Location http redirects right now
[11:15] Piotras: torben: I think that we could check sources to call mysql_free_result everywhere it is needed , and do not call mysql_free_result at the end of request
[11:15] bergie: torben: then we must JS redirect, I guess
[11:16] torben: sounds sensible
[11:16] torben: bergie: can use html here, no need for jS
[11:16] bergie: ok
[11:16] torben: bergie: actually, i think we have a different problem here
[11:17] torben: we do, yes
[11:17] torben: brb
[11:17] bergie: wonders that since this seems to be a Zend engine bug whether PHP5 upgrade would fix it
[11:18] bergie: torben: ??
[11:18] Piotras: bergie: BTW , did you noticed somehow my scandinavian languages skills? :)
[11:18] bergie: Piotras: yes, the "perkele" comment was impressive ;-)
[11:18] Piotras: :-D
[11:18] bergie: torben: I'd like to mail something about this to the client
[11:18] Piotras: bergie: look on google , the same segfault with PHP5
[11:19] bergie: ouch
[11:19] Piotras: very random ghost segfault
[11:19] bergie: Lets switch to .Net ;-)
[11:19] Piotras: ;)
[11:19] bergie: then we could at least blame Microsoft instead of these "shabby Open Source guys"
[11:20] Piotras: bergie: I can try to look for solution this week , but I make release delayed
[11:20] Piotras: lol
[11:20] torben: bergie: .net at least has a decent debugger and safe garbage collection
[11:20] bergie: Piotras: thanks!
[11:20] torben: bergie: as for the flush solution
[11:21] torben: it works in cases where we have a segfault at the end of the request
[11:21] bergie: but?
[11:21] torben: i'm just trying to create a new account and getting a sf there
[11:21] torben: the browser apparently follows the redirect, as far as i see it, and the new request sf's after the datamanager has created a lock
[11:21] torben: i just don't know where exactly
[11:23] bergie: m-p.org is down?
[11:23] torben: ?
[11:23] bergie: hmm... cache for front page returned nothing. now it is ok
[11:23] torben: *args*
[11:23] torben: runs away screaming
[11:25] torben: i have an emergency
[11:25] torben: is off
[11:25] torben: bbl later this day
[11:25] torben left the chat room. ("Leaving")
[11:26] Kaukola: bergie: see you soon...
[11:26] Kaukola left the chat room.
[11:40] bergie: Piotras: do you still have the link to the stuff you found on google about this segfault?
[11:41] bergie: Piotras: http://bergie.iki.fi/blog/2004/2004-09-01-000.html
[11:44] Piotras: http://aspn.activestate.com/ASPN/Mail/Message/php-Dev/2027706
[11:44] Piotras: compare backtrace to these ones reported by Torben and me
[11:45] Piotras: http://groups.google.pl/groups?q=mysql_free_result+segfault&hl=pl&lr=&ie=UTF-8&selm=c34pre%241j1o%241%40FreeBSD.csie.NCTU.edu.tw&rnum=1