Software Engineer, Data Acquisition
6 days ago
About Mistral
At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.
We democratize AI through high-performance, optimized, open-source, and cutting-edge models, products, and solutions. Our comprehensive AI platform is designed to meet enterprise needs, whether on-premises or in cloud environments. Our offerings include le Chat, the AI assistant for life and work.
We are a dynamic, collaborative team passionate about AI and its potential to transform society.
Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed between France, the USA, the UK, Germany, and Singapore. We are creative, low-ego, and team-spirited.
Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on
Role Summary
We are looking for a skilled and motivated Web Crawling and Data Indexing Engineer to join our dynamic engineering team. The ideal candidate should have a solid background in web scraping, data extraction, and indexing, with experience using advanced tools and technologies to collect and process large-scale data from diverse web sources.
What you will do
As a Software Engineer in the Data Acquisition team, you will:
- Develop and maintain web crawlers using Python libraries such as Beautiful Soup to extract data from target websites
- Use headless browsers, driven through interfaces such as the Chrome DevTools Protocol, to automate and optimize data collection processes
- Collaborate with cross-functional teams to identify, scrape, and integrate data from APIs to support business objectives
- Create and implement efficient parsing patterns using regular expressions, XPaths, and CSS selectors to ensure accurate data extraction
- Design and manage distributed job queues using technologies such as Redis, Kubernetes, and Postgres to handle large-scale data processing tasks
- Develop strategies to monitor and ensure data quality, accuracy, and integrity throughout the crawling and indexing process
- Continuously improve and optimize existing web crawling infrastructure to maximize efficiency and adapt to new challenges
About You
- Core Programming & Web Technologies
  - Proficiency in Python, Java, or C++
  - Strong understanding of HTTP/HTTPS protocols and web communication
  - Knowledge of HTML, CSS, and JavaScript for parsing and navigating web content
- Data Structures & Algorithms
  - Mastery of queues, stacks, hash maps, and other data structures for efficient data handling
  - Ability to design and optimize algorithms for large-scale web crawling
- Web Scraping & Data Acquisition
  - Hands-on experience with web scraping libraries/frameworks (e.g., Scrapy, BeautifulSoup, Selenium, Playwright)
  - Understanding of how search engines work and best practices for web crawling optimization
- Databases & Data Storage
  - Experience with SQL and/or NoSQL databases (e.g., PostgreSQL, MongoDB) for storing and managing crawled data
  - Familiarity with data warehousing and scalable storage solutions
- Distributed Systems & Big Data
  - Knowledge of distributed systems (e.g., Hadoop, Spark) for processing large datasets
- Data Analysis & Visualization
  - Proficiency in Pandas, NumPy, and Matplotlib for analyzing and visualizing scraped data
- Bonus Skills (Nice-to-Have)
  - Experience applying Machine Learning to improve crawling efficiency or accuracy
  - Familiarity with cloud platforms (AWS, GCP) and containerization (Docker) for deployment
Hiring Process
Here is what you should expect:
- Introduction call - 35 min
- Hiring Manager Interview - 30 min
- Live-coding Interview - 45 min
- System Design Interview - 45 min
- Culture-fit discussion - 30 min
- Reference checks
Location & Remote
This role is primarily based at one of our European offices (Paris, France and London, UK). We will prioritize candidates who either reside in Paris or are open to relocating. We strongly believe in the value of in-person collaboration to foster strong relationships and seamless communication within our team.
In certain specific situations, we will also consider remote candidates based in one of the countries listed in this job posting — currently France, UK, Germany, Belgium, Netherlands, Spain and Italy. In that case, we ask all new hires to visit our Paris office:
- for the first week of their onboarding (accommodation and travelling covered)
- then at least 3 days per month
What we offer
Competitive salary and equity
Health insurance
Transportation allowance
Sport allowance
Meal vouchers
Private pension plan
Generous parental leave policy
Visa sponsorship
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.