Candidates must live within commuting distance of Baltimore, Maryland, to be considered and must work on-site five days a week in Woodlawn, MD.
NOT CONSIDERING RELOCATIONS
Key Required Skills:
Strong knowledge of AI/ML/LLM, Python, NLP, and Generative AI, plus experience in the clinical domain
Position Description:
• Stay up to date on new methods in NLP, ML, and Generative AI
• Understand real world challenges and develop automated data solutions
• Develop, test, and deploy new techniques for NLP understanding
• Develop and deploy ML and Generative AI approaches at scale, such as Large Language Models (LLMs)
• Train and optimize NLP/LLM models and create Python-based pipelines
• Determine the nature of analytic problems, evaluate options, and offer recommendations for resolution.
• Advise on the methods and data needed and/or available to evaluate the (intelligence or data) problem.
• Collaborate with data collectors and analysts to identify and close gaps on complex monitoring problems.
• Provide accurate, timely, complex, and sophisticated data analysis.
Skills Requirements:
• Bachelor's degree in Statistics, Applied Mathematics, Computer Science, or Information Science, with industry experience in NLP, data science, and AI/ML/LLM engineering.
• Minimum of 8 years of Data Scientist experience
• Must be able to obtain and maintain a Public Trust. Contract requirement.
*** Selected candidate must be willing to work on-site in Woodlawn, MD 5 days a week.
(Required Skills)
These skills will help you succeed in this position:
• Experience with Natural Language Processing (NLP), Generative AI and Large Language Models (LLM)
• Fluency in Python programming, version control and collaboration with Git, standard Python packages (e.g., pandas, NumPy, Matplotlib), and ML frameworks
• Knowledge of TensorFlow, PyTorch, Pandas, scikit-learn, NLTK, Azure ML (optional), Amazon Web Services EC2.
• Experience with scalable data engineering frameworks such as Apache Spark and orchestration frameworks such as Airflow, and/or experience with semantic search.
• Expert knowledge in conducting data analysis and applying advanced statistical concepts and ML methods to build, train, test, and evaluate a variety of supervised and unsupervised analytic models.
• Experience with ML model deployment and operations practices such as DevOps, MLOps, and LLMOps.
• Experience with NLP and Generative AI libraries (e.g., spaCy, LangChain), regular expressions, text annotation tools, and semantic frameworks.
• Ability to clean and process large amounts of real-world data.
• Experience retrieving and manipulating data from a variety of data sources, including DB2, Oracle, SQL Server, Hadoop, and flat files.
• Experience with SQL and database management systems (e.g., PostgreSQL, MySQL, SQLite).
• Excellent analytical skills to identify potential risks and propose effective solutions.
• Excellent problem-solving skills, the ability to collaborate with cross-functional teams, and proven written and verbal communication with a range of audiences, including executive leadership.
(Desired Skills)
• Prior experience working on applications in the clinical domain.
• Prior experience with federal or state government IT projects.
• Experience with, or the ability and willingness to learn, distributed processing in the Hadoop ecosystem (e.g., Spark, Impala, and Hive).
• Experience working in an analytical research environment.
• Experience in parallel processing such as GPU programming with CUDA
• Experience with Mathematica
• Experience using markup languages such as LaTeX, HTML, etc.
• Experience with Natural Language Processing for anomaly detection
Education:
• Bachelor's degree with 12+ years of experience