Transaction timeout with long runnig process

Description

I ran a process to import about 20.000 lines of GL Journal entries from the "Import GL Journal" window. This process would have run about 2 hours but after about 1.5 hours it broke.

As you see the bigger part of the import table entries are done and some are not. When I started the process I said that it it has to rollback if there is an error. (I think this is another issue in the importer process.)

It should be possible to work with big datasets and let a process (and a transaction) work for more than 1.5 hours.

Environment

None

Attachments

2

Activity

Show:

Chuck Boecking September 12, 2019 at 12:40 PM

Hi , This ticket is closed; however, I am not sure what the resolution/prescription was. Here is what I believe it is (please correct if wrong):

  • There is still an issue with very large files that require more than 1.5hr to upload

  • The practical solution is to break the files into their pieces

  • If commonly importing large file, then consider using the i_table approach since you have much more control over actions, performance and transactions.

Thank you for your consideration! Chuck

Reference:

Carlos Ruiz December 24, 2014 at 3:27 PM

Hi, I'm attaching two patches.

IDEMPIERE-644_v1.patch

  • Rollback and close the transaction when there is a commit

  • DB.verifyTrx was changed to raise an exception if a transaction is not found (previously it just logged a message saying that works as autocommit) - this can have collateral consequences on wrongly written processes - but I think is a must to guarantee data consistency

IDEMPIERE-644_ToTest.patch

  • Changes the trx timeout to 2 minutes

  • Changes the monitor to run every minute

  • Creates a dummy process on VerifyLanguageConfiguration for testing purposes - I made it to insert into t_selection table within transaction every 5 seconds

The final result was that the transaction was correctly rolled back on timeout and the user is notified with an exception message "Transaction closed or never opened (SvrProcess_3720b31e-59c7-4296-9f01-fa89c7516c55) => (maybe timed out)"

More tests and feedback required.

There are some additional comments on the meeting:
http://www.globalqss.com/idempiere/irclog/%23idempiere.2014-12-24.log.html

Regards,

Carlos Ruiz

Norbert Bede December 24, 2014 at 3:00 PM

Please consider the next idea, when fixing this issue: https://idempiere.atlassian.net/browse/IDEMPIERE-1975

Thomas Bayen December 24, 2014 at 10:36 AM

I would say that in the very most cases a database transaction should not run for so long. That eats up database resources and makes iDempiere noticable slower for other users. Apart from the technical facts I think that a user does not expect that just pressing a button (without a special warning or such) runs for longer than some minutes. Perhaps with the exception of looping through many records - but then he expects a progress bar and a stop button.

Even if you have a really long running process you should try to cut it into single transactions (and give the user the opportunity to filter, to see a log of the done/not done work and to restart the undone work - e.g. like it is already implemented with the importer windows' "imported" field).

My idea is to have a much shorter default transaction timeout like 5 minutes. That is a maximum timespan that a normal user expects after pressing a button without an explicit warning. Then we can have a timeout column at ad_process to override that value if the developer really thinks that it is justified and he has written a warning in the description text.

Thomas

Deepak Pansheriya December 24, 2014 at 6:48 AM

Below is hengsin's suggestions. Recording on ticket

for the timeout ticket - 1) commit upon timeout is a bug, should rollback instead.
2) hard to calculate a good timeout - guess easier to just add trx timeout column to ad_process

Fixed

Details

Assignee

Reporter

Components

Fix versions

Affects versions

Priority

Created February 19, 2013 at 1:08 AM
Updated September 12, 2019 at 12:40 PM
Resolved January 13, 2015 at 1:58 PM