Machine Learning (ML)
ML Applications in Wireless Communication Networks
The project aims to detect user anomalies and inference patterns in wireless networks, with the goal of improving the data rates of communication systems.
*The results of the tests were published at an international conference. The statistical method was adopted by a local startup company (GOHM) for vehicle localization and data protection.
Led a project developing physical layer authentication methods for wireless networks using device fingerprinting and machine learning.
Conducted tests on cellphones with LabVIEW and National Instruments devices; cleaned and prepared data using Python packages such as matplotlib and pandas.
Developed a statistical method for data operations and feature extraction using Python, pandas, NumPy, and seaborn.
Classified cellphones with K-Nearest Neighbors (KNN) and implemented Support Vector Machines (SVMs) in network device transceivers using TensorFlow, PyTorch, and scikit-learn.
Used spectrograms from Fast Fourier Transform (FFT) calculations via SciPy, NumPy, and matplotlib to input frequency, time, and amplitude data into a custom-designed CNN model, differentiating legitimate IoT sensors from attackers.
Achieved 99% accuracy on 8,000 test images for identifying legitimate sensors, compared to 96% for attackers.
Published the statistical method, which was later adopted by GOHM-Embedded Intelligence for vehicle localization and data protection.
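The spectrogram step mentioned above can be illustrated with a minimal sketch. It is not the project's code: the sampling rate, the synthetic test tone, and the window settings are assumptions chosen for illustration, and only `scipy.signal.spectrogram` and NumPy are used.

```python
import numpy as np
from scipy import signal

# Assumed capture parameters; the real sampling rate and burst length are not stated.
fs = 1_000_000                      # sampling rate in Hz (hypothetical)
t = np.arange(10_000) / fs          # 10 ms of samples
rng = np.random.default_rng(0)

# Synthetic stand-in for a recorded transmission: a 100 kHz tone plus noise.
x = np.sin(2 * np.pi * 100_000 * t) + 0.1 * rng.standard_normal(t.size)

# Short-time FFT: f (Hz), seg_t (s), and Sxx (power) give the frequency,
# time, and amplitude axes of the spectrogram.
f, seg_t, Sxx = signal.spectrogram(x, fs=fs, nperseg=256, noverlap=128)

# Log scaling and min-max normalization produce an image-like array for a CNN.
img = 10 * np.log10(Sxx + 1e-12)
img = (img - img.min()) / (img.max() - img.min())

# Frequency bin carrying the most average power.
peak_freq = f[np.argmax(Sxx.mean(axis=1))]
```

Each row of `img` is a frequency bin and each column a time segment, so the array can be saved as a grayscale image or stacked into a batch for training.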
ML Applications in Magnetic Domain Pattern Images
Benefit: the ability to generate domain patterns with out-of-range parameters that cannot be run in a micromagnetic simulation.
To extract parameters from magnetization patterns and to generate new images of them, images of simulated magnetization patterns from micromagnetic simulations will be input into Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) for training.
*Worked on the latest technology, Heat-Assisted Magnetic Recording (HAMR), the leading high-capacity drive technology.
Applied ML techniques to magnetization pattern images generated using advanced storage technology (HAMR).
Defined project scope and selected optimal Convolutional Neural Network (CNN) architectures for parameter extraction from magnetization images.
Cleaned and preprocessed the data and built a tailored CNN. Used PyTorch, pandas, and Principal Component Analysis (PCA) for feature extraction from these images.
Developed magnetic domain images through custom GANs and tested DALL-E's adaptability for text-to-image/art generation in magnetic contexts.
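As an illustration of PCA-based feature extraction from images, here is a minimal NumPy sketch. The random 32x32 arrays stand in for simulated magnetization images, and the function name, image size, and number of components are hypothetical; the project's actual pipeline may differ.

```python
import numpy as np

def pca_features(images, n_components):
    """Project flattened images onto their top principal components."""
    X = images.reshape(len(images), -1).astype(np.float64)
    Xc = X - X.mean(axis=0)                      # center each pixel
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return Xc @ components.T, components, explained[:n_components]

rng = np.random.default_rng(0)
images = rng.random((50, 32, 32))                # placeholder image stack
features, components, ratio = pca_features(images, n_components=8)
```

`features` holds one 8-dimensional vector per image, which could serve as a compact input alongside, or instead of, raw pixels.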
Advanced Reinforcement Learning Agent for Pong Using TensorFlow and Keras
Keras implementation of the blog post "Deep Reinforcement Learning: Pong from Pixels", which originally used Python's NumPy library for neural network operations.
Training process.
Implemented the project by integrating TensorFlow and Keras, showcasing the ability to adapt and modernize legacy code with current deep learning frameworks.
Constructed a neural network model using Keras with a hidden layer of 200 neurons, demonstrating skills in neural network architecture design.
Employed Policy Gradient methods for intelligent decision-making, reflecting an understanding of advanced reinforcement learning techniques.
Implemented custom preprocessing and a tailored loss function to optimize the agent's performance, displaying proficiency in data manipulation and algorithm customization.
Demonstrated the agent's gameplay in real-time through OpenAI Gym's environment, showcasing the practical application of machine learning in a dynamic setting.
Successfully trained, saved, and reloaded the model for demonstration purposes, illustrating the project's adaptability and reproducibility.
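The reward handling behind the Policy Gradient bullet can be sketched as follows. This is a minimal NumPy illustration in the style of the blog post, not the repository's code; the function names, the default discount factor, and the toy reward sequence are assumptions.

```python
import numpy as np

def discount_rewards(rewards, gamma=0.99):
    """Discounted returns, restarting the running sum whenever a point is scored."""
    rewards = np.asarray(rewards, dtype=np.float64)
    discounted = np.zeros_like(rewards)
    running = 0.0
    for t in reversed(range(rewards.size)):
        if rewards[t] != 0:
            running = 0.0            # Pong-specific: a nonzero reward ends a rally
        running = running * gamma + rewards[t]
        discounted[t] = running
    return discounted

def normalize(returns):
    """Standardize returns so they can weight the log-probability loss."""
    return (returns - returns.mean()) / (returns.std() + 1e-8)

# Toy episode: the agent wins one rally (+1) and then loses one (-1).
episode_rewards = [0, 0, 1, 0, 0, -1]
returns = discount_rewards(episode_rewards, gamma=0.5)
# returns -> [0.25, 0.5, 1.0, -0.25, -0.5, -1.0]
advantages = normalize(returns)
```

Actions taken shortly before a scored point receive the largest weight, which is what lets the policy associate individual paddle moves with the outcome of a rally.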
GitHub code can be found here: https://github.com/zkhodzhaev/reinforcement_learning
During testing, the trained model was executed using pure Python in Jupyter Notebook.
Representation Learning using Multi-Layer Perceptrons (MLPs)
The study utilized multi-layer perceptrons (MLPs) for their versatility with diverse data types and relationships. MLPs with 50 and 100 nodes in hidden layers were chosen to balance complexity and efficiency, considering the feature sizes of datasets like the German Credit Dataset (27 features), Credit Defaulter Dataset (13 features), and Bail Dataset (18 features). This aimed to capture patterns effectively in datasets with 1,000 to 30,000 entries and prevent overfitting in smaller datasets. More details about datasets can be found here.
MLPs were configured with various hidden layer sizes and activation functions, including 'identity', 'logistic', 'tanh', and 'relu', ranging from simple to complex structures. Hyperparameters like learning rate and batch size added complexity and affected training dynamics and generalization capabilities. The study used default settings for these parameters to minimize variability.
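A configuration sweep of this kind might look like the sketch below. The activation names match the options of scikit-learn's `MLPClassifier`, so the sketch assumes that class; the synthetic data (sized like the German Credit Dataset), the train/test split, and the random seeds are placeholders, not the study's setup.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in with 1,000 rows and 27 features; not the real dataset.
X, y = make_classification(n_samples=1000, n_features=27, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

results = {}
for hidden in [(50,), (100,)]:
    for activation in ["identity", "logistic", "tanh", "relu"]:
        # Learning rate and batch size are left at the library defaults.
        clf = MLPClassifier(
            hidden_layer_sizes=hidden, activation=activation, random_state=0
        )
        clf.fit(X_train, y_train)
        scores = clf.predict_proba(X_test)[:, 1]
        results[(hidden, activation)] = {
            "auroc": roc_auc_score(y_test, scores),
            "f1": f1_score(y_test, clf.predict(X_test)),
        }
```

With default settings scikit-learn may emit a convergence warning for some configurations; that does not stop the sweep, and `results` still holds one AUROC and F1 entry per configuration.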
Dimensionality reduction techniques like PCA and t-SNE were implemented to manage high-dimensional data. PCA transformed features into principal components, preserving essential information, while t-SNE retained local data structures in reduced dimensions. PCA was iteratively applied to explore the impact of dimensionality reduction on model performance. t-SNE, limited to three components for efficiency, was used to assess its effect on MLP performance.
The study showed variations in MLP performance across datasets. For instance, an MLP with a single 100-node layer achieved an AUROC of 76.02% and an F1-score of 80.28% on the German Credit Dataset. Fairness metrics such as statistical parity (SP) and equal opportunity (EO) varied, indicating potential predictive biases. Training MLPs on data with altered sensitive attributes slightly improved fairness metrics but reduced performance, emphasizing the need for techniques that balance fairness and predictive accuracy.
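For reference, the two fairness metrics can be computed as in this minimal sketch. It assumes SP and EO denote the statistical parity difference and the equal opportunity difference in their common binary-group form; the function names and the toy labels are hypothetical.

```python
import numpy as np

def statistical_parity_diff(y_pred, sensitive):
    """|P(pred=1 | s=0) - P(pred=1 | s=1)|"""
    y_pred, sensitive = np.asarray(y_pred), np.asarray(sensitive)
    return abs(y_pred[sensitive == 0].mean() - y_pred[sensitive == 1].mean())

def equal_opportunity_diff(y_true, y_pred, sensitive):
    """|P(pred=1 | y=1, s=0) - P(pred=1 | y=1, s=1)|"""
    y_true, y_pred, sensitive = map(np.asarray, (y_true, y_pred, sensitive))
    positive = y_true == 1
    tpr_group0 = y_pred[positive & (sensitive == 0)].mean()
    tpr_group1 = y_pred[positive & (sensitive == 1)].mean()
    return abs(tpr_group0 - tpr_group1)

# Toy predictions for two groups of four people each.
y_true    = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred    = np.array([1, 1, 1, 0, 1, 0, 0, 0])
sensitive = np.array([0, 0, 0, 0, 1, 1, 1, 1])

sp = statistical_parity_diff(y_pred, sensitive)          # |3/4 - 1/4| = 0.5
eo = equal_opportunity_diff(y_true, y_pred, sensitive)   # |2/2 - 1/2| = 0.5
```

Both values are 0 for a classifier that treats the two groups identically under the respective criterion, so lower is better.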
Overall, the research highlighted the importance of careful model configuration and appropriate dimensionality reduction in ensuring fairness and stability in representation learning.
GitHub code can be found here: https://github.com/zkhodzhaev/representation_learning/
Here's a summary of the MLP results as compared to other methods:
* Results obtained after randomly flipping sensitive attributes to augment the data.