CADET-Match SMA Simulation Not Converging

thomas.stuart · July 16, 2021, 6:56pm

Hello,

I am attempting to run CADET-Match to fit Steric Mass Action adsorption parameters. However, I am running into a couple of issues that are keeping my simulation from converging. I was previously successful at converging my CADET-Match script to calibrate my system/column, so I am not sure why I am running into difficulty here with my binding model.

Typically, my simulation stops running after Generation 0 without converging, or it will run for days for 20+ generations without reducing the RMSE value. After each simulation, I clear the results directory or create/use a new one, but I keep running into the same issue.

Is there any explanation for why I am having issues running CADET-Match for calibration of my binding model parameters? Any advice/tips to resolve this issue would be greatly appreciated. I pasted part of the output from the simulation below. Please let me know if there is any additional information that I need to provide.

Thank you - Dwyer

2021-07-16 11:41:34,473 util.py process_population 864 Generation 0 approximately 79.2 % complete with 38/48 done
2021-07-16 11:41:34,633 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:44:23,925 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:44:24,114 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:47:13,317 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:47:13,319 util.py process_population 864 Generation 0 approximately 87.5 % complete with 42/48 done
2021-07-16 11:47:13,561 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:50:02,720 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:50:02,922 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:52:52,100 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:52:52,102 util.py process_population 864 Generation 0 approximately 95.8 % complete with 46/48 done
2021-07-16 11:52:52,311 util.py runExperiment 313 Simulation Timed Out
2021-07-16 11:52:52,699 loggerwriter.py write 10 C:\Users\thomas.stuart\Anaconda3\lib\site-packages\scipy\stats\stats.py:410: RuntimeWarning: divide by zero encountered in log

  log_a = np.log(a)

2021-07-16 11:52:52,706 gradFD.py search 89 starting coarse refine
2021-07-16 11:52:52,707 gradFD.py search 93 ending coarse refine
2021-07-16 11:52:52,936 loggerwriter.py write 10 C:\Users\thomas.stuart\Anaconda3\lib\site-packages\deap\tools\emo.py:606: RuntimeWarning: invalid value encountered in true_divide

  fn = (fitnesses - best_point) / (intercepts - best_point)

2021-07-16 11:52:53,006 loggerwriter.py write 10 Traceback (most recent call last):

2021-07-16 11:52:53,006 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\match.py", line 322, in <module>

2021-07-16 11:52:53,009 loggerwriter.py write 10     
2021-07-16 11:52:53,009 loggerwriter.py write 10 main(map_function=map_function)
2021-07-16 11:52:53,010 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\match.py", line 33, in main

2021-07-16 11:52:53,011 loggerwriter.py write 10     
2021-07-16 11:52:53,012 loggerwriter.py write 10 hof = evo.run(cache)
2021-07-16 11:52:53,012 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\evo.py", line 123, in run

2021-07-16 11:52:53,013 loggerwriter.py write 10     
2021-07-16 11:52:53,014 loggerwriter.py write 10 return cache.search[searchMethod].run(cache, tools, creator)
2021-07-16 11:52:53,014 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\search\nsga3.py", line 36, in run

2021-07-16 11:52:53,016 loggerwriter.py write 10     
2021-07-16 11:52:53,016 loggerwriter.py write 10 return checkpoint_algorithms.eaMuPlusLambda(
2021-07-16 11:52:53,017 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\checkpoint_algorithms.py", line 134, in eaMuPlusLambda

2021-07-16 11:52:53,018 loggerwriter.py write 10     
2021-07-16 11:52:53,019 loggerwriter.py write 10 progress.writeProgress(
2021-07-16 11:52:53,019 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\progress.py", line 484, in writeProgress

2021-07-16 11:52:53,021 loggerwriter.py write 10     
2021-07-16 11:52:53,021 loggerwriter.py write 10 update_results(
2021-07-16 11:52:53,021 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\CADETMatch\progress.py", line 283, in update_results

2021-07-16 11:52:53,023 loggerwriter.py write 10     
2021-07-16 11:52:53,023 loggerwriter.py write 10 hf["input"][-len(result_data["input"]) :] = result_data["input"]
2021-07-16 11:52:53,023 loggerwriter.py write 10   File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper

2021-07-16 11:52:53,024 loggerwriter.py write 10   File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper

2021-07-16 11:52:53,025 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\h5py\_hl\dataset.py", line 707, in __setitem__

2021-07-16 11:52:53,027 loggerwriter.py write 10     
2021-07-16 11:52:53,027 loggerwriter.py write 10 for fspace in selection.broadcast(mshape):
2021-07-16 11:52:53,028 loggerwriter.py write 10   File "C:\Users\thomas.stuart\Anaconda3\lib\site-packages\h5py\_hl\selections.py", line 299, in broadcast

2021-07-16 11:52:53,029 loggerwriter.py write 10     
2021-07-16 11:52:53,030 loggerwriter.py write 10 raise TypeError("Can't broadcast %s -> %s" % (target_shape, self.mshape))
2021-07-16 11:52:53,030 loggerwriter.py write 10 TypeError
2021-07-16 11:52:53,030 loggerwriter.py write 10 : 
2021-07-16 11:52:53,030 loggerwriter.py write 10 Can't broadcast (0,) -> (1, 4)
2021-07-16 11:52:53,039 util.py info 54 process shutting down

w.heymann · July 17, 2021, 7:01pm

Can you share the full log file, json file, simulations and csv files? I can then look at the system to see what is wrong.

thomas.stuart · July 19, 2021, 5:13pm

Hello William,

Attached are the requested log, JSON, simulation, and csv files. Due to space constraints, I only included the csv/h5 file for one of the three experimental runs that was used for fitting model parameters. Thank you for your assistance!

11-Run.csv (41.6 KB)
Match_Test.json (1.6 KB)
SMA_5.h5 (3.8 MB)
SMA_5_Final.csv (75.2 KB)
mainlog.txt (33.3 KB)

w.heymann · July 20, 2021, 12:55pm

It looks like the SMA_5.h5 file is somehow corrupted. It may be easier for you to just zip up the data, put it on something like Google Drive or OneDrive, and share the link.

thomas.stuart · July 20, 2021, 1:40pm

Hello William,

Good idea. Attached is the link to the Google Drive folder with the relevant files.

w.heymann · July 23, 2021, 12:45pm

Sorry this has taken me a bit. I had a new laptop arrive to replace the one that failed, and it has taken a while to get things set up.

It looks like SMA_5.h5 is still damaged and won’t open in hdf5view, but SMA_10.h5 worked fine and that was enough to find some problems.

One thing that is missing, which is slowing down the simulations and also causing some of them to fail, is that the column is not equilibrated first. The initial bound salt concentration (init_q for the first component) should be equal to SMA_LAMBDA, and it helps if you set init_c for the first component to your inlet salt concentration.

Essentially, without doing that, you are saying the column is filled with no liquid phase ions and that the binding sites are all ionized because they have nothing bound to them.
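
As a minimal sketch of the equilibration described above, the helper below builds the two initial-condition vectors. The function name, the example numbers, and the assumptions that salt is component 0 and that every component has one bound state are illustrative only and are not taken from the shared files.

```python
def equilibrated_initial_state(salt_inlet, sma_lambda, n_comp):
    """Build INIT_C / INIT_Q for a column equilibrated with the inlet salt.

    Assumes component 0 is salt and all other components start at zero.
    """
    if n_comp < 1:
        raise ValueError("need at least the salt component")
    init_c = [0.0] * n_comp
    init_q = [0.0] * n_comp
    init_c[0] = float(salt_inlet)  # liquid phase starts at the inlet salt level
    init_q[0] = float(sma_lambda)  # every binding site starts occupied by salt
    return {"INIT_C": init_c, "INIT_Q": init_q}


# Hypothetical values: 50 mM salt at the inlet, SMA_LAMBDA of 1200, salt + one protein.
state = equilibrated_initial_state(salt_inlet=50.0, sma_lambda=1200.0, n_comp=2)
print(state)
```

The resulting values would then be written to the column unit's init_c and init_q entries in each simulation file before running the match.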

Let me know if you still have problems.

thomas.stuart · July 26, 2021, 3:14pm

Hi William,
Thank you for your suggestion. After changing the inputs for the initial salt concentrations in the solid/liquid phases, the simulation was able to converge! I have two quick follow-up questions:

First, how does CADET-Match select which set of parameters to use for “Best”? I noticed that for my simulation, the lowest RMSE values given in the results spreadsheet were not chosen as the best fit against my experimental data.
Second, if I wanted to further calibrate my model using new experimental runs, do I need to re-calibrate my model with CADET-Match fitting against all (old + new) experimental runs, or could I use my initially calibrated model simulation files and then fit those against only the new experimental runs?

w.heymann · July 27, 2021, 7:05am

The system constructs a Pareto front containing the geometric mean of the scores, the average of the scores, and the lowest score, and keeps that Pareto front as the best entries. It should also keep the lowest RMSE and SSE values. All of these values are stored in the meta folder that it creates. Everything in that folder is basically equally good but with a different compromise. If you use the IPython notebook interface you don’t see all of that, as the notebook interface is very limited. All of the actual data is still there, though, so even if you run it that way you can look at exactly what the system found.
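
To illustrate the idea, here is a small sketch of a Pareto filter over the three summary values mentioned above. This is not CADET-Match's actual implementation; the function names, the candidate data, and the assumption that higher scores are better are all hypothetical.

```python
from math import prod


def summarize(scores):
    """Collapse per-feature scores into (geometric mean, mean, minimum)."""
    n = len(scores)
    return (prod(scores) ** (1.0 / n), sum(scores) / n, min(scores))


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates whose summary no other candidate dominates."""
    summaries = {name: summarize(s) for name, s in candidates.items()}
    return sorted(
        name
        for name, s in summaries.items()
        if not any(dominates(o, s) for other, o in summaries.items() if other != name)
    )


# Made-up scores for three parameter sets (assumed: higher is better).
candidates = {
    "A": [0.90, 0.90, 0.90],  # balanced fit
    "B": [1.00, 1.00, 0.75],  # better on average, worse in its weakest score
    "C": [0.80, 0.80, 0.80],  # worse than A in every summary value
}
front = pareto_front(candidates)
print(front)
```

Here A and B both survive because each wins on a different summary value, which is the sense in which entries on the front are equally good with different compromises.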

In general the lowest RMSE is rarely the best fit. At least my experience so far is that the shape is more predictive of how the system behaves and a small time offset with the correct shape is a better fit than having the curve in the right place with the wrong shape. SSE and RMSE both favor peaks that are in the right place but wrong shape.

I would normally match against all runs at the same time. Because of things like pump delays and small changes in loading concentration, a simultaneous fit gives a better result and can better account for these errors. Fitting one experiment at a time tends to just give you results that fit one of the experiments well and the others poorly.

thomas.stuart · September 2, 2021, 3:21pm

Hello William,

Thank you for your help assisting us with the SMA model and for taking the time to answer my questions. After successful completion of SMA model calibration, my group plans to expand our CADET model to include pH as an input parameter. Therefore, we have started working with the GIEX binding model. However, we are running into a consistent issue when operating CADET-Match for GIEX. After the completion of Generation 0, the model signifies that it is still running, but no further output is given. When investigating the “results” spreadsheet, I can see that while different combinations of variables are being used, they do not seem to affect the RMSE value.

Could you please look at the attached h5 files, experimental files, output directory, and CADET-Match file in the Google Drive folder linked here? If the issue is with the model, are there any additional models you would recommend us to use for pH as an input parameter?

Any advice for resolving this issue would be greatly appreciated.
Thank you, Dwyer Stuart

w.heymann · September 3, 2021, 7:48am

I will look at this either later today or on Monday. At a quick glance what I can see is that none of the parameters being changed are changing the output of the model at all. There is likely something wrong with the setup.

What you may want to try first is to run the model in a notebook and print out the same output that you are having CADET-Match look at, /output/solution/unit_002/SOLUTION_OUTLET_COMP_002. Then change some of the same input variables and see if you see any change at all in the solution.
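
A rough sketch of that check follows. The helper names and the demo numbers are hypothetical; load_outlet assumes h5py is installed and that the simulation has already been run so the output dataset exists.

```python
OUTLET = "/output/solution/unit_002/SOLUTION_OUTLET_COMP_002"


def load_outlet(path, dataset=OUTLET):
    """Read one outlet trace from a simulated .h5 file (requires h5py)."""
    import h5py  # imported lazily so the comparison helpers work without it

    with h5py.File(path, "r") as f:
        return f[dataset][()].tolist()


def max_abs_difference(a, b):
    """Largest pointwise difference between two equally long traces."""
    if len(a) != len(b):
        raise ValueError("traces must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def output_changed(a, b, tol=1e-9):
    """True if changing the input moved the outlet trace by more than tol."""
    return max_abs_difference(a, b) > tol


# Made-up traces standing in for load_outlet("before.h5") and load_outlet("after.h5").
baseline = [0.0, 0.5, 1.0, 0.5, 0.0]
unchanged = list(baseline)
perturbed = [0.0, 0.4, 1.1, 0.5, 0.0]
print(output_changed(baseline, unchanged), output_changed(baseline, perturbed))
```

If the trace stays identical after a parameter is changed, that parameter is probably not reaching the part of the model that produces this output.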

w.heymann · September 6, 2021, 3:51pm

I found a few problems:

CADETMatch is having a problem calculating the right offsets for the bound components. This can be easily worked around by just using index instead of component and bound. I will put the JSON at the end of this post.
In experiments, each experiment needs a unique name, and the lack of unique names is causing problems.
init_q for component 0 should probably be set to 200 so that it matches lambda.
A significant speedup (10x or so) can be obtained by setting /input/model/solver/LINEAR_SOLUTION_MODE=2 in your simulations. These simulations have enough unit operations that CADET’s automatic detection of when to use parallel vs. sequential solving defaults to parallel solving, but since your simulations have no loops of any kind, the sequential method is much faster.

JSON changes

        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_ka_lin",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_ka_quad",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_ka_salt",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_ka_prot",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_KD_LIN",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_KD_QUAD",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_KD_SALT",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_KD_PROT",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_NU_LIN",
            "min": 0.0001,
            "max": 10000.0
        },
        {
            "transform": "auto",
            "index": 2,
            "location": "/input/model/unit_005/adsorption/GIEX_NU_QUAD",
            "min": 0.0001,
            "max": 10000.0
        }
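
Since the entries above differ only in their location, they could also be generated rather than typed by hand. The sketch below uses only the standard library; the helper name index_variable is hypothetical, and the two parameter names are copied from the entries above as examples.

```python
import json


def index_variable(location, index, lower=0.0001, upper=10000.0):
    """Build one index-based variable entry in the same shape as the JSON above."""
    return {
        "transform": "auto",
        "index": index,
        "location": location,
        "min": lower,
        "max": upper,
    }


base = "/input/model/unit_005/adsorption/"
names = ["GIEX_KD_LIN", "GIEX_KD_QUAD"]  # extend with the remaining parameter names
entries = [index_variable(base + name, index=2) for name in names]
print(json.dumps(entries, indent=4))
```

The printed list can be pasted into the match configuration in place of the hand-written entries.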