QUESTION:

In HF analysis, the computation run time in 8.0 is much slower than in 6.1. Why does this happen, and is there a workaround?


ANSWER:

We found two sources of the performance difference. One is solver related; the other is related to element 120. I can tell you how to fix the solver issue, and I'll forward this information to the element developer for comment on the performance of el120.

After 6.1 we started optimizing memory usage for the solvers. The intent was to run problems in memory (incore) rather than out-of-core whenever possible. Indeed, your 8.1 run was incore while the 6.1 run was out-of-core. But there is a gotcha with incore runs. ANSYS allows multiple loads with the same factorization. To support this we save the sparse solver workspace, so that on a subsequent load, as soon as we determine that the boundary conditions are identical, we can resume the sparse solver and do only a forward/backward solve. This is much cheaper than doing two factorizations. For an out-of-core run the matrix factor is already sitting in a file, so all we have to save is the rest of the sparse solver work array. But in an incore run, saving the solver work array means saving the whole factor as well, and the I/O to do this is very slow on Windows. The way to avoid this is an undocumented option on the bcsopt command. The tests that we ran using this option made a huge difference in solver time. Add this to your input before the SOLVE command:

bcsopt,,,,-1,,-5

That's four commas, a -1, then two more commas and a -5. The -5 enables performance debug output. The -1 suppresses writing the LN22 file that backs up the workspace.

Another way to get this done is to tell the solver to run in optimal out-of-core mode rather than incore. To do this, use:


bcsopt,,opti,,,,-5
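A minimal sketch of where either workaround sits in an input file. The /SOLU block and the commented setup steps are placeholder context for illustration, not commands taken from the original report:

```
/SOLU
! ... element types, mesh, boundary conditions, and loads as usual ...

! Option 1: stay incore but suppress the LN22 workspace backup
bcsopt,,,,-1,,-5

! Option 2 (use instead of option 1): force optimal out-of-core mode
! bcsopt,,opti,,,,-5

SOLVE
```

Whichever option you pick, the bcsopt line must appear before SOLVE so the sparse solver picks up the setting.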

I tried both solutions and got about the same results. Incore was faster, but both were now competitive with the 6.1 results. We also noticed that version 7.1 of ANSYS ran the complex sparse solver faster than 8.0, but 8.0 is still comparable with 6.1, even as is.
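The factorization reuse described above (factor the matrix once, then do only a cheap forward/backward solve for each new load) can be sketched outside ANSYS with SciPy's sparse LU. This is an illustrative stand-in for the idea, not the ANSYS sparse solver itself:

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

# A small sparse system matrix, standing in for the assembled stiffness matrix.
A = csc_matrix(np.array([[4.0, 1.0, 0.0],
                         [1.0, 3.0, 1.0],
                         [0.0, 1.0, 2.0]]))

# Factorize once -- this is the expensive step the solver caches.
lu = splu(A)

# Each new load vector reuses the cached factor: only a
# forward/backward substitution is needed, no refactorization.
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([0.0, 1.0, 0.0])
x1 = lu.solve(b1)
x2 = lu.solve(b2)
```

The cost the answer describes is not the reuse itself but saving the incore factor to disk between loads, which is what the bcsopt workaround avoids.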

The other performance problem is that the time spent writing element results for el120 has increased significantly. In 6.1 the timing log showed:

8  hLSSet2  76416   0.297   0.000004   0.00  0.000  1.00
8  el120    76416  28.328   0.000371  29.00  0.000  1.00
8  elnfor   76416   0.375   0.000005   0.00  0.000  1.00
8  elenrg   76416   0.594   0.000008   1.00  0.000  1.00

In 8.1 we got this timing for the el120 element:

8  hLSSet2  76416    0.250   0.000003   0.00  0.000  1.00
8  el120    76416  118.656   0.001553   0.00  0.000  1.00
8  elnfor   76416    0.781   0.000010   0.00  0.000  1.00
8  elenrg   76416    0.484   0.000006   0.00  0.000  1.00


For now, I'd recommend trying either of the bcsopt solutions described above.




