I connect the dots.
Data Science, Machine Learning, & AI
Hey there,
My name is Juan. I am a data scientist and certified chemical engineer with a well-developed background in sales and account management.
Based in Brisbane, Australia, I am skilled in Python and Power BI, in building and retraining neural networks, and in implementing algorithms for text analysis (natural language processing) and image analysis. I also have knowledge of SQL.
Developing predictive models using Python to solve complex business problems
Leveraging SQL and Excel to extract, transform and load data from various sources
Creating interactive dashboards using Power BI for business stakeholders
Building visualizations using matplotlib, seaborn, and other data visualization libraries
Training and testing models using various supervised and unsupervised machine learning algorithms
Developing and implementing Natural Language Processing (NLP) solutions for text classification and sentiment analysis
Technologies used: Neural Networks
I created a predictive model using neural networks, specifically the TensorFlow-Keras framework, to forecast customer churn. I first cleaned and prepared the data using Python libraries such as NumPy and Pandas. Once the data was ready, I tested the model and achieved an 80% accuracy rate, meaning that the model correctly predicted whether a customer would churn in 80% of the test cases. A model like this can be a valuable tool for companies aiming to reduce churn and improve customer retention.
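To make the accuracy figure above concrete, here is a minimal sketch of how accuracy relates to precision and recall for a churn classifier. It is plain Python rather than the project's Keras code, and the labels, the function names `confusion_counts` and `churn_metrics`, and the numbers are all made up for illustration; they are not the project's data or results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def churn_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for the 'churned' class (label 1)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Of the customers flagged as churners, how many really churned?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the customers who really churned, how many were flagged?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels (1 = churned, 0 = stayed), for illustration only.
y_true = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0]
metrics = churn_metrics(y_true, y_pred)
print(metrics)
```

In this toy example the accuracy is 80% while precision and recall for the churned class are lower, which is why churn models are usually judged on more than accuracy alone when churners are the minority.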
Technologies used: Natural Language Processing
I have developed several models using different datasets to classify and predict information, including news classifiers, sentiment detection on Twitter datasets, and book genre classifiers. To build these models, I have used open-source libraries like spaCy and fastText, as well as custom pipelines built on CountVectorizer and TfidfVectorizer. Depending on the data available, I have used supervised algorithms such as MultinomialNB, KNN, and LinearSVC. These models could provide valuable insights for businesses and researchers trying to gain a better understanding of their data. Working with a range of datasets and algorithms has given me a solid understanding of the different approaches to machine learning and how they can be applied across industries.
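The idea behind a word-count pipeline with a multinomial Naive Bayes classifier can be sketched in a few lines. This is a dependency-free illustration, not the scikit-learn code used in the projects: the class `TinyMultinomialNB`, the `tokenize` helper, and the four training sentences are hypothetical and exist only to show the technique.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class TinyMultinomialNB:
    """Bag-of-words multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training sentences, for illustration only.
texts = [
    "great product love it",
    "terrible service very slow",
    "love the fast delivery",
    "slow and terrible support",
]
labels = ["positive", "negative", "positive", "negative"]

model = TinyMultinomialNB().fit(texts, labels)
print(model.predict("love it"))
print(model.predict("terrible and slow"))
```

In practice a library pipeline adds better tokenization, TF-IDF weighting, and sparse matrices, but the scoring rule is the same.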
Technologies used: Neural Networks
I retrained a Google model that was originally built to classify different animals so that it could identify different species of whales. Specifically, I trained the model to recognise whale species even when only parts of the animal were visible in the image. Through this process, I achieved an 83% accuracy rate. The result is a tool that could be used by researchers and conservationists seeking to better understand and protect whale species.
Technologies used: Machine Learning
I have created a machine learning model that can predict employee attrition with a high degree of accuracy. To develop this model, I utilised two supervised algorithms, namely Logistic Regression and MLP Classifier, and through extensive testing, I was able to achieve an 89% accuracy rate. This model could be a valuable tool for businesses looking to retain their employees and improve their retention rates. Before feeding the data into the model, I cleaned and prepared it using the Pandas library in Python, which allowed me to remove any potential outliers or anomalies in the data. Overall, this project has been an exciting opportunity to leverage machine learning and data analysis to solve real-world problems and create positive change within organisations.
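One common way to remove outliers before training is the interquartile range (IQR) rule. The sketch below is an assumption about what such a step could look like: the text above does not say which rule was used, the project itself used Pandas rather than the standard library, and the function `remove_outliers_iqr` and the sample values are hypothetical.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical numeric column with one extreme entry.
monthly_income = [3200, 3400, 3100, 3600, 3300, 3500, 25000]
cleaned = remove_outliers_iqr(monthly_income)
print(cleaned)
```

The multiplier `k=1.5` is the conventional default; a larger value keeps more of the extreme rows, which can matter when unusual employees are exactly the ones of interest.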
Technologies used: Neural Networks
I've developed a machine learning model using Convolutional Neural Networks (CNN) to classify images of potato leaves and detect any signs of blight infection. Through extensive testing, I was able to achieve a 98% accuracy rate, which demonstrates the efficacy of the model. To prepare the images before they were fed into the model, I used a range of Python libraries such as NumPy and Pandas, which allowed me to preprocess the data and ensure that it was ready for analysis. The model I created could be a valuable tool for farmers and researchers seeking to identify blight infections in potato crops and take action to mitigate the issue. Overall, this project has been an exciting opportunity to apply machine learning to real-world problems and create meaningful solutions.
Technologies used: Power BI
I have created several dashboards using publicly available car insurance and sales data. Visualising the data revealed trends and patterns that were not readily apparent otherwise. To prepare the data, I used tools including the Pandas library in Python and MySQL, which allowed me to clean, organise, and analyse the data more effectively. Dashboards like these give businesses and researchers a clearer view of their data and support decision-making and strategic planning. Overall, this project has been an exciting opportunity to use data analysis and visualisation to create solutions that provide tangible value for businesses.