Overview

Don’t apply if

  • You rarely (or never) change your mind on things
  • You speak more than you listen
  • You get defensive about your ideas
  • You have a hammer and you always use it (i.e. you try to solve every problem with the skill set you feel strongest in)
  • You always need to be in the spotlight

Do apply if

  • You listen more than you speak
  • You first think about the best possible solution to the problem, then work backwards to what’s possible with the resources you have
  • You are happy to drop your idea if circumstances have changed and it’s no longer the best solution
  • You use precise language and you insist that others do too
  • You state clearly what you expect from others and what they can expect from you
  • You derive satisfaction from knowing that you have done a good job, even if nobody else noticed
  • You can communicate clearly using written documentation
  • You are a fan of systems thinking and you use it often

Why – our Mission

We are working on decentralizing the distribution of information and credibility.

Society suffers when small groups control what others get to know and believe, whether through control of broadcasting (TV, radio, newspapers, access to social media platforms) or of credibility signaling (who should be trusted and who shouldn’t).

What We’re Working On

We are building an influence algorithm. In other words, we are trying to find ways to describe groups of people mathematically. Many tried and failed before. But we think we can make it work.

Our core hypothesis is that influence can be quantified by tracking attention flows. To do that, we ingest data streams from multiple sources (we started with Twitter and are now indexing podcasts, with more to come). We then cross-reference these datasets to continuously improve accuracy.
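To give a flavor of the idea, here is a toy sketch of what cross-referencing two attention streams could look like. The mock data, the weights and the scoring rule are illustrative assumptions, not our actual pipeline:

    # Toy sketch: combining mention counts from two (mock) attention
    # streams into a naive per-entity score. The weights and the scoring
    # rule are illustrative assumptions, not the real algorithm.

    tweets = ["@alice on episode 42", "@alice again", "@bob posts charts"]
    podcast_transcripts = ["alice joined the show", "bob was quoted"]

    def mentions(stream, name):
        # Count how often a name appears across a stream of texts.
        return sum(name in text.lower() for text in stream)

    def attention_score(name, w_tweets=1.0, w_podcasts=3.0):
        # Cross-referencing step: a podcast mention is (arbitrarily)
        # assumed to carry more attention than a single tweet.
        return (w_tweets * mentions(tweets, name)
                + w_podcasts * mentions(podcast_transcripts, name))

    for name in ("alice", "bob"):
        print(name, attention_score(name))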

The accuracy of our work is verified by members of the groups that we aim to describe. We publish our results in real time, and thousands of people are already using our scores. It is hard to verify when we are right, but it is very easy to tell when we are wrong. This short feedback loop puts us in a unique position to work on problems that might be much harder, or even impossible, to solve elsewhere.

Work setup

We are a small, VC-funded, remote-first startup. Most of the team is based in Europe (Berlin, London, Barcelona). You can make your own hours, but everybody is expected to be online during office hours in CET. We try to meet in person and work together for several days at least every 3 months. Other than that, the company ‘lives’ in Slack, Notion and other tools that enable effective communication.

The job is full-time and permanent. If you work from Germany, you will need a German work permit and will get an employment contract. If you work from anywhere else in the world, you will work as a freelancer and will need to fulfill the legal requirements for freelancing in your country.

If you work from Berlin, you can work from our Berlin office, located in Mitte.

About this role

You will apply creative solutions to problems, working across the Algorithm, Platform and Product teams. You will ensure that we are deploying optimal solutions to the most pressing problems.

Your ultimate goal is to ensure that the Algorithm team receives the best possible dataset. You will use this objective to define and prioritize your own tasks. This does not mean that you will always work directly on obtaining the dataset required by the Algo team. In some cases, you may decide that the best way to help the Algo team achieve its objective is to first provide another data stream to the Product team. This, in turn, will result in user-generated data that enables you to refine the data stream the Algo team initially requested.

You will be responsible for designing the data structure. You will have to not only account for current challenges, but also anticipate what data we will be collecting in the years to come. You will make sure that our architecture can handle all the kinds of data that we are looking to index.

This, of course, means that you have to have a deep understanding of what the data is going to be used for. You need to be able to evaluate what level of accuracy is required and what can be achieved with various approaches and use this information to decide which strategy to pursue.
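For a flavor of what that could look like in practice, below is a minimal sketch of a source-agnostic record shape. The field names and structure are assumptions for illustration, not our actual schema:

    # Minimal sketch of a source-agnostic content record. Field names and
    # structure are illustrative assumptions, not our actual schema.
    from dataclasses import dataclass, field
    from datetime import datetime
    from typing import Optional

    @dataclass
    class ContentItem:
        source: str                # e.g. "twitter", "podcast", "rss"
        source_id: str             # identifier within the source system
        author_id: Optional[str]   # unified author reference, if resolved
        published_at: datetime
        text: str                  # normalized text payload
        # Source-specific fields that don't fit the common shape go into an
        # open-ended dict, so indexing a new source doesn't force a migration.
        extras: dict = field(default_factory=dict)

    episode = ContentItem(
        source="podcast",
        source_id="ep-123",
        author_id=None,
        published_at=datetime(2022, 5, 1),
        text="Transcript excerpt ...",
        extras={"duration_s": 3600, "platform": "rss"},
    )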

You will think about these problems in the very broad context of the whole organization. You need to understand each team’s dependencies, our data providers and our technical limitations. This role requires a combination of excellent people skills and technical skills.

You will own the whole lifecycle of collecting, cleaning and refining our data.

Collecting – Cleaning – Refining

You will work with multiple APIs, RSS feeds, scrapers and any other means you see fit to collect relevant data. This may also include designing processes and incentives for users to willingly provide data directly to us.
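As one small example of that kind of ingestion work, here is a minimal sketch of pulling episode metadata from a podcast RSS feed with the feedparser library. The feed URL is a placeholder assumption:

    # Minimal sketch of ingesting a podcast RSS feed with feedparser
    # (pip install feedparser). The feed URL is a placeholder.
    import feedparser

    FEED_URL = "https://example.com/podcast/feed.xml"  # hypothetical feed

    feed = feedparser.parse(FEED_URL)
    for entry in feed.entries:
        # Each entry carries the episode metadata we would normalize and store.
        print(entry.get("title"), entry.get("published"))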

You will be responsible both for identifying the best source for a given piece of data and for the technical execution. You will make sure that all of our processes are legal and ethical, and that they result in high-quality data.

You will employ various techniques to make sure that the data is as clean and correct as possible. You will have the flexibility to explore various technical approaches, and also to experiment with system-design and/or social-design solutions.

For example, your tasks may include:

  • Indexing Twitter accounts and building and maintaining a classifier that labels them as Personal, Brand or Bot (a sketch follows this list)
  • Matching content (e.g. tweets, reddit posts) with podcast episodes or live events that they are discussing
  • Matching corresponding podcast episodes across multiple podcasting platforms and YouTube
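To make the first task above concrete, here is a minimal sketch of such an account classifier built on account bios with scikit-learn. The training bios and labels are made up for illustration; a real system would use far more data and richer features (posting cadence, follower graph, etc.):

    # Minimal sketch of a Personal/Brand/Bot classifier over account bios,
    # using scikit-learn. Training examples are made up for illustration.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    bios = [
        "dad, runner, occasional tweeter",          # Personal
        "official account of Acme Corp",            # Brand
        "follow for hourly crypto signals",         # Bot
        "coffee lover sharing life in berlin",      # Personal
        "customer support for MegaBank",            # Brand
        "automated retweets of trending hashtags",  # Bot
    ]
    labels = ["personal", "brand", "bot", "personal", "brand", "bot"]

    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(bios, labels)

    print(model.predict(["we are hiring! official careers account"]))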

We are in a unique position where we can leverage both multiple streams of data and user input. You will be responsible for making sure we take full advantage of this. For example, it is likely to be easy to perform 90% of the labelling with an algorithmic solution, while the remaining 10% is very hard to get right through automation yet trivial for a user to do, and small enough to be labelled by hand. In this case (sketched in code after this list), you would need to:

  1. Come up with, design and deploy the automated solution (e.g. a Machine Learning model)
  2. Make sure that the right incentives are in place so that users perform the desired labelling, and that the labelling is of high quality
  3. Design and perform a test to validate that the users do perform the work as intended
  4. Deploy metrics to track the performance of the combined solution (algorithmic + social) and its sub-components.
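One plausible shape for the hand-off between the automated solution (step 1) and the users (steps 2–3) is confidence-threshold routing, with a coverage ratio as a first cut at the metrics in step 4. The 0.9 threshold and the predict_with_confidence interface below are assumptions for illustration:

    # Sketch of routing items between an automated model and human
    # labellers by prediction confidence. The 0.9 threshold and the
    # predict_with_confidence interface are illustrative assumptions.
    def route(items, model, threshold=0.9):
        auto_labelled, for_humans = [], []
        for item in items:
            label, confidence = model.predict_with_confidence(item)
            if confidence >= threshold:
                auto_labelled.append((item, label))
            else:
                for_humans.append(item)  # sent to the user-labelling queue
        return auto_labelled, for_humans

    def coverage(auto_labelled, for_humans):
        # Share of items handled automatically: one simple sub-component
        # metric for the combined (algorithmic + social) solution.
        total = len(auto_labelled) + len(for_humans)
        return len(auto_labelled) / total if total else 0.0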

Requirements

  • Python proficiency
  • Deep interest in, and broad familiarity with, Machine Learning algorithms
  • Past experience working with APIs and web scraping
  • Extensive experience with relational and non-relational databases
  • Past experience setting up and optimizing classifiers, with a proven track record of improving performance
  • Good communication & writing skills

Great to have

  • Interest in and familiarity with the latest developments in Deep Learning and general AI

Compensation

EUR 50,000 – 70,000 per year

Equity 0.14 – 0.18 %
