Jim,
I am willing to give you access to the system if you are willing to take a look.
Thanks,
Al

On Wed, Apr 26, 2023 at 11:51 PM 'Jim Idle' via jBASE <[email protected]> wrote:

> Say hello to Bob for me. :)
>
> It is common for companies to try and move away. It’s also common for that
> to fail if it is a rewrite. But I understand why you've not upgraded.
>
> I think you do need to try a restart. One user may be a red herring here,
> as if there are background programs running, then it could be one of those
> programs causing all the writes. Difficult to say without a look at the
> system itself.
>
> On Wed, Apr 26, 2023 at 21:53 Alan Metz <[email protected]> wrote:
>
>> Jim,
>> Well, I have been working with a friend of mine, Bob Wyatt; however,
>> we have not been able to find anything definitive.
>> The syncd runs every 60 seconds.
>> As far as system workload, we have about 300 users, many with multiple
>> Accuterm sessions at once, so yes, there is much activity throughout
>> the day. What bothers me is the fact that I can duplicate the problem
>> with only 1 user attached! (The one thing that I did not try was to
>> eliminate all users, restart AIX and jBASE, and see if the problem
>> occurs without anyone running any jBASE programs - to your point about
>> software code changes. I have been coding our system for over 30 years
>> now, so yes, some code changes have happened this year... but that is
>> normal operations.)
>> At this point, I would be inclined to make use of your services to see
>> if you can find anything! (I do want you to be aware that my company
>> decided, 2 years ago, to move away from Pick/jBASE in early 2024 - much
>> to my dismay - which is why I haven't upgraded the system.)
>> Let me know if you are interested.
>> Thanks,
>> Al
>>
>> On Wed, Apr 26, 2023 at 1:42 AM 'Jim Idle' via jBASE <[email protected]> wrote:
>>
>>> Any progress on this?
>>>
>>> On Fri, Apr 21, 2023 at 3:40 PM Jim Idle <[email protected]> wrote:
>>>
>>>> So, I can no longer find much documentation on AIX 6.1 because it is
>>>> end-of-lifed.
>>>> But I think that you can determine if this is the cause of your
>>>> problem by:
>>>>
>>>> edit the file /sbin/rc.boot using sudo (it may be /etc/rc.boot on your
>>>> version of AIX), and find where it starts the syncd daemon. Start out with
>>>> 5 seconds - in some cases you can make this shorter. It looks like the -i
>>>> option is what you need:
>>>>
>>>> start /usr/sbin/syncd -i 5
>>>>
>>>> You can change it for the current system without making it permanent by
>>>> killing the syncd and restarting it with a new seconds value. That way you
>>>> can try different values until you get one that suits your system.
>>>>
>>>> The AIX 7.1 documentation also recommends turning on the random write
>>>> behind function using the ioo command, but I suspect that that is not
>>>> there on AIX 6.1.
>>>>
>>>> I suspect that things hang while this is happening because, by default,
>>>> the process causes locks to be held against the inodes (jBASE files) that
>>>> have dirty writes outstanding. In AIX 7.1, you can prevent syncd from
>>>> locking the inode with ioo -o sync_release_ilock=1. See if that is
>>>> also an option in 6.1.
>>>>
>>>> Please let us know if that helps.
>>>>
>>>> However, if it does help, then you need to work out why this is now an
>>>> issue when it was not before. I can only think that there has been a change
>>>> to your application software, but that is speculation of course.
>>>>
>>>> Jim
>>>>
>>>> On Thu, Apr 20, 2023 at 11:25 AM Jim Idle <[email protected]> wrote:
>>>>
>>>>> Ah, right. I think you can rule out the network then, as you are
>>>>> seeing intermittent stutter in actual program response time here. That's
>>>>> kicked out a lot of issues. To be honest, I should have recognized this as
>>>>> an XY problem from the start - my apologies.
>>>>>
>>>>> Now, I used to be a dab hand at tuning AIX, especially with jBASE of
>>>>> course, but it's been a "number of years" ;)
>>>>>
>>>>> So, something has changed, but unless you changed some of your
>>>>> application software, then it is something that has happened over time
>>>>> that has now hit a bottleneck. There have been a few good suggestions
>>>>> here already, so I will assume that you have looked at those by now.
>>>>>
>>>>> So, the main thing I remember was that there is a kernel tuning
>>>>> parameter for memory flushing of dirty memory buffers, which was (and I
>>>>> assume still is) controlled by a flush daemon, which I think used to be
>>>>> syncd or flushd. The parameter controls how often this daemon runs.
>>>>>
>>>>> Right now, this is my first guess as to what is happening, as this was
>>>>> always the answer back in the day. And, guess what? The default time for
>>>>> this daemon to run is either 30 or 40 seconds, which seems to fit the bill.
>>>>>
>>>>> The scenario is as follows:
>>>>>
>>>>> - Someone thinks that this kernel parameter should be high and
>>>>>   changes it such that the system doesn't try to flush dirty memory to
>>>>>   disk until it gets a lot of dirty buffers
>>>>> - Nothing seems to change right away, but one day your workload
>>>>>   changes slightly and...
>>>>> - The syncd (or whatever it is these days) wakes up every 30
>>>>>   seconds or so, sees that 70% of your memory is in need of being
>>>>>   written and it tries to do that all at once in one massive glob of
>>>>>   writes - everything else has to pause and wait.
>>>>> - The actual setting should be that the flush cycle runs more
>>>>>   often, not less often, so that you get a smooth, averaged out
>>>>>   performance.
>>>>> - The setting, especially on a write busy system, should be about
>>>>>   5 seconds
>>>>>
>>>>> This was performance problem #1 with jBASE on AIX. AIX is generally a
>>>>> great system, but tuning it is a bit of a nightmare sometimes.
>>>>> I used to have a whole instruction set for people in the field to do
>>>>> this, but I don't have access to that and haven't for a long time. I
>>>>> don't know if maybe someone like *Bruce Decker* has a copy of that
>>>>> email - he might. If not, then we will need to find out what I used to
>>>>> do starting from first principles.
>>>>>
>>>>> The daemon is either flushd or syncd (it is called different things on
>>>>> different systems). As I say, the default is 30 seconds or
>>>>> something similar. You want this to run MORE often, not LESS often. Also,
>>>>> have a think about whether the system workload has changed in terms of
>>>>> writes. More users? Extra business? Someone changed the background tasks
>>>>> to do more writes?
>>>>>
>>>>> I will try and find my notes etc. about this, but while I cannot
>>>>> guarantee that this is your issue, I would be willing to bet a pint on it.
>>>>> We would need to run some vmstat and related commands to put this
>>>>> together, but I bet that if you ran vmstat at the same time as the
>>>>> script that produced the measurements above, you would find that the
>>>>> delay corresponds to a massive spike in disk writes.
>>>>>
>>>>> BTW, your system is quite a bit out of date; AIX 6.1 has been basically
>>>>> end-of-lifed and we are on AIX 7.1 now, I think. I would recommend
>>>>> upgrading, and probably moving to AWS rather than physical hardware. Also,
>>>>> upgrade jBASE and switch to the file type that does not need any sizing
>>>>> maintenance. My own tests show those files to be the fastest we ever had.
>>>>> I don't know how many users you have, but even if you wanted local
>>>>> hardware, I think it would be a trivial cost to move to a decent rack
>>>>> based modern system with Linux. Probably save the money on power costs!
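To put rough numbers on the flush-interval scenario described in this message, here is a toy model, not a measurement of this system. It assumes a constant rate of dirty data and a fixed disk write throughput (both figures are hypothetical), and that everything else stalls while one flush drains; real AIX behaviour is more complicated than this.

```python
def flush_burst(write_rate_mb_s: float, interval_s: float, disk_mb_s: float):
    """Return (burst size in MB, stall in seconds) for one flush cycle.

    Toy model: dirty data accumulates at a constant rate between flushes
    and is then written out in one go at the disk's throughput.
    """
    burst_mb = write_rate_mb_s * interval_s
    stall_s = burst_mb / disk_mb_s
    return burst_mb, stall_s


# Hypothetical figures, chosen only for illustration.
WRITE_RATE_MB_S = 20.0
DISK_MB_S = 100.0

for interval in (60, 30, 5):
    burst, stall = flush_burst(WRITE_RATE_MB_S, interval, DISK_MB_S)
    print(f"interval {interval:>2}s: burst {burst:6.0f} MB, stall ~{stall:.1f}s")
```

In this model the total time spent writing is the same whichever interval is used; what changes is whether it arrives as one long pause or as many short ones, which is the "smooth, averaged out" behaviour the message argues for.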
>>>>>
>>>>> There is no work out there in the world right now, so if this is a big
>>>>> issue for you, then I am available for hire on a no win no fee basis ;)
>>>>>
>>>>> Jim
>>>>>
>>>>> On Wed, Apr 19, 2023 at 2:58 AM Alan Metz <[email protected]> wrote:
>>>>>
>>>>>> Well...
>>>>>> I did some more testing.
>>>>>> btw AIX 6.1, no changes to AIX, and yes, using telnet
>>>>>>
>>>>>> I assume that you have ruled out network configuration changes? (We
>>>>>> did increase bandwidth across the entire network recently)
>>>>>>
>>>>>> I removed the SD-WAN network from the equation over the weekend.
>>>>>> I attached my laptop to a switch and the server ONLY to the same
>>>>>> switch - I did notice the delay.
>>>>>>
>>>>>> Tried same setup with a different switch - noticed delay
>>>>>> Tried different Ethernet cable from server to switch - noticed delay
>>>>>> (I wrote a program to track the frequency by Executing a LISTPEQS and
>>>>>> recording the time it took to render the results; if it took more than
>>>>>> 1 second I tracked the time - most iterations are less than 1 second.
>>>>>> What I found out was that it appears a ~40 second delay occurs
>>>>>> approximately 5 to 6 minutes apart.
>>>>>> (there were a few 2 to 3 second pauses in between that I excluded))
>>>>>>
>>>>>> Event Date   Start Time   End Time     Delay Seconds
>>>>>> 04/18/2023   11:42:10AM   11:42:53AM   43
>>>>>> 04/18/2023   11:47:56AM   11:48:39AM   43
>>>>>> 04/18/2023   11:53:33AM   11:54:16AM   43
>>>>>> 04/18/2023   11:59:28AM   12:00:12PM   44
>>>>>> 04/18/2023   12:05:13PM   12:05:54PM   41
>>>>>> 04/18/2023   12:11:03PM   12:11:39PM   36
>>>>>> 04/18/2023   12:16:53PM   12:17:34PM   41
>>>>>> 04/18/2023   12:22:34PM   12:23:17PM   43
>>>>>> 04/18/2023   12:28:34PM   12:29:16PM   42
>>>>>> 04/18/2023   12:34:05PM   12:34:48PM   43
>>>>>> 04/18/2023   12:39:50PM   12:40:34PM   44
>>>>>> 04/18/2023   12:45:43PM   12:46:26PM   43
>>>>>> 04/18/2023   12:51:26PM   12:52:09PM   43
>>>>>> 04/18/2023   12:57:05PM   12:57:49PM   44
>>>>>> 04/18/2023   01:02:33PM   01:03:16PM   43
>>>>>> 04/18/2023   01:08:26PM   01:09:09PM   43
>>>>>> 04/18/2023   01:14:22PM   01:15:05PM   43
>>>>>> 04/18/2023   01:19:58PM   01:20:42PM   44
>>>>>> 04/18/2023   01:25:42PM   01:26:27PM   45
>>>>>> 04/18/2023   01:31:41PM   01:32:25PM   44
>>>>>>
>>>>>> My question is: can I somehow determine if a background process is
>>>>>> causing the hangs? I do have Phantom jobs running in jBASE; however,
>>>>>> the code has not changed in years and no new Phantoms have been added.
>>>>>>
>>>>>> I have added more users on the network over time, but removing the
>>>>>> network as mentioned above was tested. Unfortunately, I didn't write the
>>>>>> tracking program until Monday, after the "removing the network" test.
>>>>>> (I will say that the delays didn't appear to be as frequent with just
>>>>>> me and the server test - I suppose I could test that this weekend...)
>>>>>>
>>>>>> I wish I could provide more information, but I don't know what else
>>>>>> to test??
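The "about 40 seconds, every 5 to 6 minutes" reading of the table can be checked with a few lines of Python. This is only a minimal sketch using the first six rows quoted above; the helper names are made up for illustration, and the timestamp format is assumed from how the rows are printed.

```python
from datetime import datetime

# First six rows of the table above: (event date, start time, end time).
ROWS = [
    ("04/18/2023", "11:42:10AM", "11:42:53AM"),
    ("04/18/2023", "11:47:56AM", "11:48:39AM"),
    ("04/18/2023", "11:53:33AM", "11:54:16AM"),
    ("04/18/2023", "11:59:28AM", "12:00:12PM"),
    ("04/18/2023", "12:05:13PM", "12:05:54PM"),
    ("04/18/2023", "12:11:03PM", "12:11:39PM"),
]

# Assumed layout: MM/DD/YYYY followed by a 12-hour clock with AM/PM suffix.
FMT = "%m/%d/%Y %I:%M:%S%p"


def parse(date: str, clock: str) -> datetime:
    """Combine the date and clock columns into one timestamp."""
    return datetime.strptime(f"{date} {clock}", FMT)


def delays_and_gaps(rows):
    """Return (length of each stall, seconds between consecutive stall starts)."""
    starts = [parse(d, s) for d, s, _ in rows]
    ends = [parse(d, e) for d, _, e in rows]
    delays = [int((e - s).total_seconds()) for s, e in zip(starts, ends)]
    gaps = [int((b - a).total_seconds()) for a, b in zip(starts, starts[1:])]
    return delays, gaps


delays, gaps = delays_and_gaps(ROWS)
print("delay seconds:", delays)
print("seconds between starts:", gaps)
```

For these rows the gaps between stall starts come out between 337 and 355 seconds, which matches the "5 to 6 minutes apart" description.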
>>>>>> Thanks,
>>>>>> Al
>>>>>>
>>>>>> On Sat, Apr 15, 2023 at 11:42 AM Kannan Seshadri <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> Is it possible for you to execute whatever you are executing
>>>>>>> directly on the AIX console with a telnet session? This will clearly
>>>>>>> tell you whether you have a network issue or not.
>>>>>>>
>>>>>>> Thanks and Regards
>>>>>>>
>>>>>>> On Sat, Apr 15, 2023 at 6:42 PM Bruce Decker <[email protected]> wrote:
>>>>>>>
>>>>>>>> Is the delay between the login prompt and the password prompt? As
>>>>>>>> jimi asked, more details.
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On Apr 15, 2023, at 8:45 AM, Jim Idle <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I think more details are needed, Alan.
>>>>>>>>
>>>>>>>> What version of AIX are you running?
>>>>>>>> Are you really using telnet and not ssh? Telnet is unlikely to be
>>>>>>>> maintained.
>>>>>>>> I assume that you have ruled out network configuration changes?
>>>>>>>> Any upgrades to AIX lately?
>>>>>>>> Any change to the network load? New devices?
>>>>>>>>
>>>>>>>> In the absence of any changes, then I would definitely be looking
>>>>>>>> at network problems. When you say you tried with just one user, do you
>>>>>>>> mean literally one device and the server only on the network? If there
>>>>>>>> is a faulty system somewhere, or malware, then that would still eat
>>>>>>>> your network bandwidth.
>>>>>>>>
>>>>>>>> Finally, I presume you have done the obvious and rebooted the
>>>>>>>> server and all the network gear? You’ll probably have to start from
>>>>>>>> first principles with no devices on the network and gradually add
>>>>>>>> them in.
>>>>>>>>
>>>>>>>> On Fri, Apr 14, 2023 at 21:12 Alan Metz <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> All,
>>>>>>>>> I have recently been experiencing sporadic response delays when
>>>>>>>>> accessing Jbase, (version 5.6.0.2), from telnet sessions with all
>>>>>>>>> users in my company. At first I thought it was a network issue;
>>>>>>>>> however, I have tested this with only one user and the Jbase server
>>>>>>>>> plugged into a switch and was able to duplicate the hesitation. I am
>>>>>>>>> not logging any errors on my AIX server that would indicate a
>>>>>>>>> hardware issue. I am not sure how to further trouble-shoot this
>>>>>>>>> issue and am asking for suggestions. This system has been rock solid
>>>>>>>>> since 2018.
>>>>>>>>> Thanks,
>>>>>>>>> Al
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> --
>>>>>>>>> IMPORTANT: T24/Globus posts are no longer accepted on this forum.
>>>>>>>>>
>>>>>>>>> To post, send email to [email protected]
>>>>>>>>> To unsubscribe, send email to [email protected]
>>>>>>>>> For more options, visit this group at
>>>>>>>>> http://groups.google.com/group/jBASE?hl=en
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "jBASE" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/jbase/CAPPLyKCej9SMfqOPoBnnQLSUVSDWMEsP-1CFsCgMSZya0yS0NQ%40mail.gmail.com
>>>>>>>>> <https://groups.google.com/d/msgid/jbase/CAPPLyKCej9SMfqOPoBnnQLSUVSDWMEsP-1CFsCgMSZya0yS0NQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
