Cirrhosis is a late stage of scarring (fibrosis) of the liver caused by many forms of liver diseases and conditions, such as hepatitis and chronic alcoholism. The following data contains the information collected from the Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984. A description of the clinical background for the trial and the covariates recorded here is in Chapter 0, especially Section 0.2 of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. A more extended discussion can be found in Dickson, et al., Hepatology 10:1-7 (1989) and in Markus, et al., N Eng J of Med 320:1709-13 (1989).
A total of 424 PBC patients, referred to Mayo Clinic during that ten-year interval, met eligibility criteria for the randomized placebo-controlled trial of the drug D-penicillamine. The first 312 cases in the dataset participated in the randomized trial and contain largely complete data. The additional 112 cases did not participate in the clinical trial but consented to have basic measurements recorded and to be followed for survival. Six of those cases were lost to follow-up shortly after diagnosis, so the data here are on an additional 106 cases as well as the 312 randomized participants.
Click here to go to the original research paper. Based on that research, stage 4 is cirrhosis.
Stage 1 shows a florid, asymmetric destruction of the septal and interlobular bile ducts, which are typically surrounded by dense infiltrates of mononuclear cells, especially T lymphocytes.
In stage 2 there are more widespread lesions with a reduction of normal bile ducts and increased numbers of atypical, poorly formed bile ducts. Diffuse portal fibrosis is seen and as in stage 1 periportal cholestasis is conspicuous.
Stage 3 displays more progressive lesions with fibrous septa forming bridges.
Stage 4 represents the end stage with clear cirrhosis and may be difficult to distinguish from other types of cirrhosis.
Brief: The data has only 418 data points, and almost 1/3 of the rows contain null values. Dropping those rows with dropna() is not a good option for such a small data set.
Brief: Each stage's features have their own distributions, so filling null values with the mean, mode, or median of the whole data set could introduce severe bias. Instead, I filtered the data by stage and filled each null value with a statistic computed from that stage only. Filling null values this way could also help classifier models achieve higher accuracy.
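The per-stage filling described above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the helper name fill_nulls_by_stage, the choice of median for numeric columns and mode for categorical columns, and the tiny demo frame are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def fill_nulls_by_stage(df, target="Stage"):
    """Fill nulls per stage: median for numeric columns, mode otherwise.

    Hypothetical helper; rows whose stage itself is null are left unfilled.
    """
    out = df.copy()
    for col in out.columns:
        if col == target or not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            # Median of the same stage, broadcast back to every row.
            per_stage = out.groupby(target)[col].transform("median")
            out[col] = out[col].fillna(per_stage)
        else:
            # Most frequent value within each stage (NaN if a stage has none).
            modes = out.groupby(target)[col].agg(
                lambda s: s.mode().iloc[0] if not s.mode().empty else np.nan
            )
            out[col] = out[col].fillna(out[target].map(modes))
    return out


# Made-up demo values, only to show the mechanics.
demo = pd.DataFrame(
    {
        "Stage": [1, 1, 1, 4, 4, 4],
        "Bilirubin": [0.5, np.nan, 0.7, 3.0, 5.0, np.nan],
        "Ascites": ["N", "N", None, "Y", None, "Y"],
    }
)
filled = fill_nulls_by_stage(demo)
print(filled)
```

Compared with a global fill, each missing value here is replaced by a statistic from patients in the same stage, which is the idea the paragraph above describes.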
Brief: People whose liver disease is in stage 4 have a higher chance of presenting with Ascites, Spiders, Hepatomegaly, and Edema. In the right-hand picture above, a large color range means that the feature and the target do not have a linear relationship. Age vs. Stage seems to have a fair positive relationship. I decided to drop 'Cholesterol', 'Alk_Phos', 'SGOT', 'Tryglicerides', 'Sex', and 'N_Days'.
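A short sketch of the column drop and the Age vs. Stage check, assuming the data is held in a pandas DataFrame. The column names come from the list above; the helper name drop_weak_features and the demo values are hypothetical.

```python
import pandas as pd

# Columns named in the brief above (spelling as in the data set).
DROP_COLS = ["Cholesterol", "Alk_Phos", "SGOT", "Tryglicerides", "Sex", "N_Days"]


def drop_weak_features(df, cols=DROP_COLS):
    """Return a copy of df without the listed columns (missing ones are ignored)."""
    return df.drop(columns=[c for c in cols if c in df.columns])


# Made-up rows, only to show the mechanics.
demo = pd.DataFrame(
    {
        "Age": [40, 45, 52, 58, 63, 70],
        "Stage": [1, 2, 2, 3, 4, 4],
        "Sex": ["F", "F", "M", "F", "F", "M"],
        "N_Days": [4500, 3200, 2100, 1800, 900, 400],
    }
)
reduced = drop_weak_features(demo)

# Pearson correlation between Age and Stage on the demo rows.
age_stage_corr = reduced["Age"].corr(reduced["Stage"])
print(list(reduced.columns), round(age_stage_corr, 2))
```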
Brief: Loop over several supervised ML classifiers and neural network models (NNM) with a for-loop to find the best prediction model.
Click here to go to my deployed Flask app.
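The for-loop comparison could look roughly like the sketch below, assuming scikit-learn. The candidate models, the synthetic four-class data from make_classification, and the accuracy metric are assumptions for illustration; the notebook's actual model list and its neural network runs are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned data: 4 classes, like stages 1 to 4.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Assumed candidate list; swap in whichever classifiers you want to compare.
candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNeighbors": KNeighborsClassifier(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in candidates.items():
    model.fit(X_train_s, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test_s))

best_name = max(results, key=results.get)
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
print("best:", best_name)
```

The fitted scaler and the winning model are the kind of objects that could then be pickled, as scaler.pkl and model.pkl in the project tree suggest.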
- Create your own Flask app.
- Sign up for a Heroku account and connect it to GitHub. Click here
- Sign up for a MongoDB Atlas account and complete the setup needed for deployment. Click here
- Include a runtime.txt that contains the Python version you want to use. Click here
- Change the script's MongoDB connection settings. Click here
- Install Pigar with pip and run
  pigar generate
  in the project directory to get a list of the libraries used in that directory. To deploy correctly on Heroku, make sure every item in the list has the form library==version (e.g. gunicorn==20.0.4).
- Create a Procfile (it tells Heroku how to run app.py) containing one line:
  web: gunicorn app:app

If you followed the steps above correctly, you should now be able to run app.py remotely on Heroku.
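Before pushing to Heroku, the library==version requirement from the Pigar step can be checked with a few lines of standard-library Python. This is a hedged sketch: the helper name unpinned_lines and the sample text are made up for illustration, and the check is deliberately strict (it flags anything that is not exactly name==version).

```python
import re

# name, optional [extras], then "==" and a version, with no spaces.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9,._-]+\])?==[A-Za-z0-9.*+!_-]+$"
)


def unpinned_lines(text):
    """Return the requirement lines that are not of the form library==version."""
    bad = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if line and not PINNED.match(line):
            bad.append(line)
    return bad


# Illustrative content; only the gunicorn pin comes from the steps above.
sample = """# generated requirements
gunicorn==20.0.4
pandas
numpy>=1.20
"""
print(unpinned_lines(sample))  # → ['pandas', 'numpy>=1.20']
```

To check a real file, pass its contents instead, for example unpinned_lines(open("requirements.txt").read()), and fix any lines it reports.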
Project
├── CirrhosisPrediction.ipynb
├── Cirrhosis Prediction.pptx
├── Output Data
│   ├── Bar_Categors.png
│   ├── Classifier.png
│   ├── Classifier_Results.png
│   ├── Data_Structure.png
│   ├── FinalModel.png
│   ├── Info.png
│   ├── NNM.png
│   ├── NNM_Results.png
│   ├── NNResult.csv
│   ├── Regplot.png
│   ├── Stage1.png
│   ├── Stage2.png
│   ├── Stage3.png
│   ├── Stage4.png
│   ├── clfTestResult.csv
│   ├── model.pkl
│   └── scaler.pkl
├── Procfile
├── README.md
├── Resources
│   └── cirrhosis.csv
├── app.py
├── requirements.txt
├── runtime.txt
├── static
│   ├── css
│   │   └── style.css
│   ├── images
│   │   ├── blur-hospital.jpg
│   │   └── cirrhosis.jpg
│   └── js
│       ├── anime.js
│       └── app.js
└── templates
    ├── base.html
    ├── error.html
    ├── index.html
    ├── predict.html
    └── record.html
pip install -r requirements.txt
- https://www.mayoclinic.org/tests-procedures/bilirubin/about/pac-20393041
- https://www.mayoclinic.org/diseases-conditions/high-blood-cholesterol/symptoms-causes/syc-20350800
- https://medlineplus.gov/lab-tests/albumin-blood-test/#:~:text=Albumin%20is%20a%20protein%20made,and%20enzymes%20throughout%20your%20body.
- https://medlineplus.gov/lab-tests/ceruloplasmin-test/#:~:text=What%20is%20a%20ceruloplasmin%20test,your%20body%20that%20need%20it.
- https://medlineplus.gov/lab-tests/alkaline-phosphatase/
- https://www.healthline.com/health/sgot-test
- https://www.mayoclinic.org/diseases-conditions/high-blood-cholesterol/in-depth/triglycerides/art-20048186#:~:text=Triglycerides%20are%20a%20type%20of,triglycerides%20for%20energy%20between%20meals.
- https://www.urmc.rochester.edu/encyclopedia/content.aspx?ContentTypeID=160&ContentID=36
- https://medlineplus.gov/lab-tests/prothrombin-time-test-and-inr-ptinr/#:~:text=Prothrombin%20is%20a%20protein%20made,to%20form%20a%20blood%20clot.
- https://dev.to/vulcanwm/environment-variables-in-heroku-python-385o
- https://www.freecodecamp.org/news/how-to-deploy-an-application-to-heroku/
- https://www.fosslinux.com/50303/deploy-mongodb-on-heroku.htm
- https://www.kaggle.com/code/yashnegi01/78-accuracy-gradientboost-rf-xgb
- http://www.learningaboutelectronics.com/Articles/How-to-specify-the-Python-runtime-version-in-heroku.php
- https://coding-boot-camp.github.io/full-stack/mongodb/deploy-with-heroku-and-mongodb-atlas
- https://www.kaggle.com/datasets/fedesoriano/cirrhosis-prediction-dataset
- http://www.diva-portal.org/smash/get/diva2:769192/FULLTEXT01.pdf
- https://www.freepik.com/