Streamlit can be installed either on Google Colab or in a virtual environment on your local computer (Windows/macOS/Linux).

Three tips to tweak your Streamlit app

Streamlit is handy for running dashboards or machine learning models. This article focuses on three tips for improving your Streamlit app experience:

  • Caching
  • Customized CSS
  • Dealing with large image pixels
Example of Streamlit App done by author

1. Streamlit caching

A cache, in the form of a key-value store, is important when loading text or images. Using Streamlit's caching mechanism helps to improve speed and, in certain cases, to suppress warnings.

Snippet to add the @st.cache decorator before a function definition (recent Streamlit releases supersede st.cache with st.cache_data and st.cache_resource):

@st.cache(suppress_st_warning=True, allow_output_mutation=True)
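
The key-value idea behind caching can be sketched without Streamlit. The example below is only an illustration: it uses Python's standard-library functools.lru_cache as a stand-in for the Streamlit decorator, and load_text and call_count are hypothetical names. It is not how Streamlit implements its cache.

```python
from functools import lru_cache

call_count = 0  # hypothetical counter showing how often the function body really runs


@lru_cache(maxsize=None)
def load_text(name: str) -> str:
    """Pretend to do slow work (e.g. reading a file); the argument acts as the cache key."""
    global call_count
    call_count += 1
    return f"contents of {name}"


first = load_text("report.txt")   # cache miss: the body runs
second = load_text("report.txt")  # cache hit: the stored value is returned
other = load_text("notes.txt")    # new key: the body runs again

print(call_count)
```

The second call with the same argument never reaches the function body, which is where the speed-up of a cache comes from.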

Suppress warnings via cache

Warnings will pop up when your code contains…

Authors (alphabetical order):

Chan Pei Shan, Irene Too

University of Malaya Postgraduates.

This article refers to IEEE-CIS Fraud Detection I, here and here.

Imbalanced class

When the distribution between the fraud and non-fraud classes is highly imbalanced, model performance can be very poor. Let’s look at how imbalanced data affects the results.

Image from authors: Chart of Imbalanced class


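from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.naive_bayes import GaussianNB

# Random forest (x_train is assumed; the original fit calls were garbled)
rfmodel = RandomForestClassifier(n_estimators=200, random_state=0)
rfmodel = rfmodel.fit(x_train, y_train)
rfpred = rfmodel.predict(x_test)
print(classification_report(y_test, rfpred))
print('randomforest accuracy: ' + str(accuracy_score(y_test, rfpred)))

# Naive Bayes
nbmodel = GaussianNB()
nbmodel = nbmodel.fit(x_train, y_train)
nbpred = nbmodel.predict(x_test)
print(classification_report(y_test, nbpred))
print('naive bayes accuracy: ' + str(accuracy_score(y_test, nbpred)))

# Logistic regression
lrmodel = LogisticRegression()
lrmodel = lrmodel.fit(x_train, y_train)
lrpred = lrmodel.predict(x_test)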
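
To see why accuracy alone can mislead on an imbalanced class, here is a minimal sketch in plain Python. The label counts are made up for illustration and are not taken from the IEEE-CIS data; the "model" is a hypothetical one that always predicts the majority class.

```python
# Hypothetical labels: 1 = fraud, 0 = non-fraud (counts chosen only for illustration)
y_true = [1] * 35 + [0] * 965

# A "model" that always predicts the majority (non-fraud) class
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the true label
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the fraud class: share of actual frauds that were caught
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
actual_frauds = sum(y_true)
fraud_recall = true_positives / actual_frauds

print(f"accuracy: {accuracy:.3f}")
print(f"fraud recall: {fraud_recall:.3f}")
```

The accuracy looks high even though not a single fraud is detected, which is why per-class metrics such as those in classification_report are worth reading alongside the accuracy score.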

Authors (alphabetical order):

Chan Pei Shan, Irene Too

University of Malaya Postgraduates.

*This is a continuation from IEEE-CIS Fraud Detection I.

The dimensionality of a dataset tells us the number of attributes it contains. A high-dimensional dataset can be painful to deal with. In this article, we walk through reducing the dimensionality of data by:

  • Variance and correlation analysis
  • Principal Component Analysis

What if your dataset has more than 100 features?

When you have a dataset with more than 100 features, many of the attributes may be unnecessary, which can hurt model performance.

Here are a few methods you can use to reduce the dimensions:

Dropping low variance attributes

When a…
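
As a rough illustration of dropping low-variance attributes, the sketch below uses only the standard library. The toy columns and the VARIANCE_THRESHOLD value are assumptions made for this example, not values from the article.

```python
from statistics import pvariance

# Hypothetical toy dataset: column name -> values
data = {
    "amount": [12.0, 250.0, 31.5, 980.0, 47.0],
    "is_online": [1, 1, 1, 1, 1],  # constant column, so its variance is 0
    "hour": [3, 14, 9, 22, 17],
}

VARIANCE_THRESHOLD = 0.01  # assumed cut-off; tune it for your own data

# Population variance of every column
variances = {name: pvariance(values) for name, values in data.items()}

# Keep only the columns whose variance exceeds the threshold
kept = {
    name: values
    for name, values in data.items()
    if variances[name] > VARIANCE_THRESHOLD
}

print(sorted(kept))
```

A column that barely changes across rows carries little information for a model, so removing it shrinks the dataset at little cost.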

Authors (alphabetical order):

Chan Pei Shan, Irene Too

University of Malaya Postgraduates.

Introduction to IEEE-CIS Fraud Detection

Online payment fraud is still an issue today. When doing fraud detection, we might wonder:

How do we detect both fraud and non-fraud transactions?

In our Part I, II and III articles, we guide you through a data analysis and machine learning journey using the dataset from IEEE-CIS Fraud Detection | Kaggle. Let’s go!

Import data

import pandas as pd

train_identity = pd.read_csv('train_identity.csv')
train_transaction = pd.read_csv('train_transaction.csv')
train = pd.merge(train_transaction, train_identity, on='TransactionID')

Exploring data

The IEEE dataset consists of 590540 transactions, with 392 transaction attributes and 40 identity attributes. There are many interesting categorical features such as card…

How translation and rotation happen with ’hgtransform’

When Mathematics meets Matlab

Matlab is a useful tool for creating visualizations. In this article, I will go through how to build a 3D cylinder and apply transformations to it.

In mathematics, we learn about transformations such as rotation. Ever wondered what these look like?

Create a colourful 3D cylinder

Matlab makes creating geometric shapes simple with ready-made functions. Here, we will use the Matlab function ‘cylinder()’ to make a cylinder.

[x,y,z] = cylinder(2, 100);
t1 = hgtransform;
s1 = surf(3*x,3*y,4*z,'Parent',t1);
grid on
shading interp

We will be creating a cylinder with a radius of 6 units along the x- and y-axes (the radius of 2 scaled by 3) and a height of 4 units along the z-axis.
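
The geometry behind these calls can be checked with a small sketch. It is written in Python rather than Matlab, and cylinder_points and apply are hypothetical helpers that only mimic the idea of cylinder() and of a 4x4 transform matrix; they are not the Matlab implementation. The rotation angle and the translation vector are also assumptions chosen for illustration.

```python
import math


def cylinder_points(radius, n):
    """Points on the bottom (z=0) and top (z=1) rims of a unit-height cylinder."""
    angles = [2 * math.pi * k / n for k in range(n + 1)]
    rim = [(radius * math.cos(a), radius * math.sin(a)) for a in angles]
    return [(x, y, z) for z in (0.0, 1.0) for (x, y) in rim]


def apply(matrix, point):
    """Multiply a 4x4 homogeneous matrix by the point (x, y, z, 1)."""
    vec = (*point, 1.0)
    return tuple(sum(matrix[r][c] * vec[c] for c in range(4)) for r in range(3))


# Same scaling as in the surf() call: 3 along x and y, 4 along z
points = [(3 * x, 3 * y, 4 * z) for x, y, z in cylinder_points(2, 100)]
radius = max(math.hypot(x, y) for x, y, _ in points)
height = max(z for _, _, z in points) - min(z for _, _, z in points)

# Rotate 90 degrees about the z-axis, then translate by (1, 2, 3)
theta = math.pi / 2
transform = [
    [math.cos(theta), -math.sin(theta), 0.0, 1.0],
    [math.sin(theta), math.cos(theta), 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
moved = apply(transform, points[0])  # points[0] lies on the x-axis before the transform

print(round(radius, 6), height)
print(tuple(round(v, 6) for v in moved))
```

Rotation and translation fit into a single 4x4 matrix: the upper-left 3x3 block rotates the point and the last column shifts it.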

And here comes the cylinder:

Image from Unsplash.com

Apache Pig is a platform that provides a high-level language for analyzing huge volumes of data. Here is a simple method to install it in a Linux Ubuntu environment.


  1. Ubuntu environment.
  2. An installed and set-up Hadoop environment:

You should already be able to run a single Hadoop node:

Problem: What if my storage space is running out?

Often, when we install a lot of software on Ubuntu, storage space eventually runs out, especially on the root partition.

When your device pops up a warning box telling you that space is running out, and you have little left to delete, it is time to consider expanding the partition.

*This article is for educational purposes only

What are the latest updates on heart disease discoveries?

Heart disease, such as coronary artery disease or heart failure, can be fatal.

“17.9 million people die each year from CVDs (cardiovascular disease), an estimated 31% of all deaths worldwide.” (reported by the World Health Organization)

Patients infected with Covid-19 are reported to have heart problems afterwards.

“…More than two months later, infected patients were more likely to have troubling cardiac signs…” (mentioned by Stat News, according to Valentina O. Puntmann’s recent study)

What is the latest heart disease research news? Let’s go through this tutorial.

Import packages


Irene Too

A passionate educator and a computer science hobbyist.
