Computer Vision Research Internship: Image to Sequence Modeling

Posted 1 week ago


Zürich, Zürich, Switzerland Scandit Full-time CHF 50'000 - CHF 60'000 per year

Duration: Minimum 6 months; ideally 9–12 months, depending on the candidate's experience

Scandit gives people superpowers. Whether enabling delivery drivers to make quicker deliveries, matching a patient with their medication, or allowing retailers to make store operations more efficient, our technology automates workflows and provides actionable insights to help businesses in a variety of industries. Join us as we continue to expand, grow, and innovate, and help take Scandit to the next level.

About the Internship

We are offering a research-focused internship aimed at advancing machine learning methods for complex visual understanding tasks. The project centers on deep learning architectures for image-to-sequence modeling, such as Transformers, attention mechanisms, and modern sequence and representation-learning frameworks, to address challenging and highly structured computer vision problems. This project contributes to long-term research efforts aimed at achieving even higher performance, robustness, and generalization in large-scale visual applications.

What you will do

You will work closely with experienced ML researchers and engineers on cutting-edge research at the intersection of computer vision and sequence modeling. Your work will include:

  • Designing and experimenting with new ML architectures for structured visual data.
  • Evaluating alternative modeling paradigms (e.g., encoder–decoder, hybrid Transformer models, sequence-based representations).
  • Investigating techniques for improving robustness, generalization, and multi-view reasoning.
  • Running systematic experiments, ablations, and error analyses to validate research hypotheses.

This project provides opportunities for novel model design, extensive experimentation, and scholarly research. You will contribute to long-term innovation in our technology, with potential real-world impact for millions of users. It is an ideal position for experienced master's students, PhD collaborations, or candidates preparing for a research career in industry or academia.

Who you are

MSc or PhD student in Computer Science, Machine Learning, Artificial Intelligence, or a related field with a strong research focus. Candidates should have a solid foundation in machine learning theory, neural networks, and computer vision.

Essential Skills:

  • Proficiency in Python and deep learning frameworks such as PyTorch.
  • Practical experience designing, training, and evaluating neural networks, including CNNs and Transformer-based architectures.
  • Strong analytical and problem-solving abilities, with the capability to interpret experimental results and iterate effectively.
  • Familiarity with research best practices, including reproducibility, controlled experiments, and ablation studies.

Desirable Skills:

  • Prior research experience in computer vision, pattern recognition, sequence modeling, or image-to-sequence architectures.
  • Experience training large-scale models or working with foundation-style architectures.
  • Contributions to publications, preprints, or open-source machine learning projects.

You also bring strong communication skills and the ability to work independently in a research-oriented environment.

What We Offer

  • We are certified as a "Great Place to Work" in 10 countries
  • A highly skilled team and a fun environment where you can put your enthusiasm for computer vision challenges and cutting-edge technologies to use
  • Hackathons, summer parties, company outings and other regular events
  • Office in the city center of Zurich

Who We Are

Could your code give superpowers? Whether enabling delivery drivers to make quicker deliveries, matching a patient with their medication, or allowing retailers to make store operations more efficient, our technology automates workflows and provides actionable insights to help businesses in a variety of industries. This means we have no shortage of technical challenges for engineers like you. Join us as we continue to expand, grow, and innovate, and help take Scandit to the next level.

"Everybody is welcome here" is a celebrated component of our DNA.

At Scandit we strive to create an inclusive environment that empowers our employees. We believe that our products and services benefit from our diverse backgrounds and experiences and are proud to be a safe space for all.

All qualified applications will receive consideration for employment without regard to race, colour, nationality, religion, sexual orientation, gender, gender identity, age, physical [dis]ability or length of time spent unemployed.




