Operational lead/coordinator
Spectraforce
Cambridge, Massachusetts

a month ago

Job Description

Job Title: Operational lead/coordinator
Job Location: Cambridge, MA (02139)
Duration: 12 months

Job description:
Operational lead/coordinator for BR compute operational management.

Act as the lead/coordinator for the BR compute platforms, including NVIDIA DGX for large AI model operations. Responsibilities include, but are not limited to:
  • Provide users with access to the platform, ensuring completion of mandatory training
  • Scheduling deployments
  • Product operations, including adherence to security compliance practices and documentation
  • Platform usage and reporting: collaborate with corporate IT to determine the metrics to be implemented, and report back to the governance board
  • Platform operations and monitoring: manage user-facing communications about issues and work with internal and external partners on their resolution
  • Support users who encounter issues with AI model operations, either by helping them navigate the IT service landscape or by facilitating support from the platform vendor
  • Support environment setup/cleanup, ensuring user adherence to project model and data off-boarding according to expectations for platform usage
  • Create and publicize platform-related training and keep the materials updated

To support these activities, the following skills are required:
  • Leadership & problem solving: Co-lead the operationalization of the environment, collaborating to establish SOPs & guidelines, navigating ambiguity, and adapting to evolving systems
  • Technical Knowledge: Familiarity with interfacing with data warehouses and related services. Proficiency in Docker, Kubernetes, and SSH to assist users with container setup, port forwarding, and interactive access.
  • Strong knowledge of cloud platforms, with preference for NVIDIA (DGX) & scheduling tools, including RunAI.
  • Resource Management: Ability to monitor and manage GPU and storage resources, ensuring efficient usage and addressing any underutilization.
  • Data Management: Familiarity with data warehousing; ability to perform data upload and cleanup on the computing platform.
  • User Support and Training: Experience in providing technical training and support, particularly in using Docker and SSH, to help users manage their code and data independently.
  • Coordination and Documentation: Skills in creating detailed documentation and knowledge base articles, and coordinating with DDIT to streamline the onboarding process.
  • Operational tasks: Capability to handle technical operations tasks such as deleting containers and images from the registry and assigning resources on the cluster.
Education & Experience:
  • 5 years of experience, given the ambiguity of getting a new platform off the ground, and an educational background of a BA/BS in a technical field (or a scientific field with significant technical experience).
 

 


At SPECTRAFORCE, we are committed to maintaining a workplace that ensures fair compensation and wage transparency in adherence with all applicable state and local laws. This position’s starting pay is: $67.89/hr.
