This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Avoid "looking for job" toots from appearing amongst "vacancies". #1

Open
berkes opened this issue Jun 12, 2021 · 0 comments

Comments

berkes commented Jun 12, 2021

Intent

  • Distinguish posts by people who are looking for a job from posts by people advertising a job opening.
  • Keep "I'm looking for a job" posts from appearing amongst the "job openings".

Problem

The classifier for vacancies is very limited and naive: it only looks at tags.

What we need, however, is a way to keep toots from people who post that they "are looking for a #job" from showing up amongst the toots that offer a job.

Discussion

People looking for a job use very similar language, tags and structure to those posting job openings. Toots, being short, offer little context to tell the two apart.

For example, a vacancy uses the phrase "are you looking for a #job? we have a #vacancy", and a job hunter uses the phrase "I am looking for a #job. Do you have a #vacancy?". The important words and the tags are identical. Only a minor structural difference makes one a job opening and the other a job seeker.
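To make the point concrete, here is a minimal sketch (not the bot's actual code) that extracts the hashtags from the two example phrases. The `hashtags` helper is hypothetical and only illustrates why a tag-only classifier cannot separate them.

```python
import re


def hashtags(text: str) -> set[str]:
    """Return the lowercased hashtags found in a toot (hypothetical helper)."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", text)}


opening = "are you looking for a #job? we have a #vacancy"
seeker = "I am looking for a #job. Do you have a #vacancy?"

# Both toots carry exactly the same tags, so tags alone cannot tell them apart.
print(hashtags(opening) == hashtags(seeker))
```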

Options

  • Use more dedicated tags. Avoid generic tags such as #job or #vacancy and only index tags such as #flockingbird.
  • Only index toots when the bot is explicitly requested to, like we do with the "index me" request. E.g. an "@hunter2, index this" as reply to a toot would index that replied-to toot, similar to how the various thread-unroller bots on Twitter work.
  • Use NLP or even machine learning to classify toots. Binary text classification, e.g. with BERT, may work.
  • Don't fix the problem at all: we currently see that job seekers are a small minority amongst the job openings. Maybe we can live with this, as long as we ensure that people who don't want to be indexed are never indexed.
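As a much cruder stand-in for the NLP option above (this is not BERT, and not something the bot currently does), a first-person pattern check could look roughly like the sketch below. The pattern list and the function name `looks_like_job_seeker` are assumptions made for illustration, and such rules would miss many real phrasings.

```python
import re

# Hypothetical first-person phrases that suggest the author is seeking a job.
SEEKER_PATTERNS = [
    re.compile(
        r"\bi(?:['’]m| am)\s+(?:currently\s+)?(?:looking|searching)\s+for\b",
        re.IGNORECASE,
    ),
    re.compile(r"\bhire me\b", re.IGNORECASE),
]


def looks_like_job_seeker(text: str) -> bool:
    """Return True if any first-person job-seeking pattern occurs in the toot."""
    return any(pattern.search(text) for pattern in SEEKER_PATTERNS)


print(looks_like_job_seeker("I am looking for a #job. Do you have a #vacancy?"))
print(looks_like_job_seeker("are you looking for a #job? we have a #vacancy"))
```

The check relies only on the small structural difference between the two example phrases ("I am looking" versus "are you looking"), so it would have to be validated against real toots before being trusted.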

Example

image

Check list

  • Manual inspection of a few popular search phrases such as "linux", "OpenSource", and "FLOSS" should show no job hunters amongst the job openings.
  • No false negatives: apparent job openings should still appear in the index. A manual cross-reference with e.g. fosstodon or another popular fediverse instance for the popular tags (see above) should confirm that obvious advertisements of job openings still show up.
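The manual checks could be backed by a small hand-labelled sample. The sketch below is one way to count both kinds of error. The `count_errors` function, the `tag_only` stand-in classifier and the two-item sample are all hypothetical and only reuse the example phrases from the discussion.

```python
from typing import Callable, Iterable


def count_errors(
    is_opening: Callable[[str], bool],
    labelled: Iterable[tuple[str, bool]],
) -> dict[str, int]:
    """Count job seekers indexed as openings and openings that were missed."""
    counts = {"false_positives": 0, "false_negatives": 0}
    for text, expected_opening in labelled:
        predicted = is_opening(text)
        if predicted and not expected_opening:
            counts["false_positives"] += 1  # job seeker shown as a job opening
        elif expected_opening and not predicted:
            counts["false_negatives"] += 1  # real job opening missing from index
    return counts


def tag_only(text: str) -> bool:
    """Stand-in for a tag-based classifier: any generic job tag counts."""
    lowered = text.lower()
    return "#job" in lowered or "#vacancy" in lowered


# Hand-labelled sample: (toot text, True if it is a job opening).
SAMPLE = [
    ("are you looking for a #job? we have a #vacancy", True),
    ("I am looking for a #job. Do you have a #vacancy?", False),
]

print(count_errors(tag_only, SAMPLE))
```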