No Last Name Needed
bot@lemmy.smeargle.fans to Hacker News@lemmy.smeargle.fans · 1 year ago

Indexing a billion pages

blog.mwmbl.org

  • cross-posted to:
  • technology@lemmy.world
It’s two years since we launched Mwmbl, the open source, non-profit search engine, on Boxing Day 2021. A good time to take stock of where we are and where we’re going.

We’ve indexed over 100 million pages

Thanks to our volunteers, who crawl the web using the Firefox extension and command line script, we’re crawling up to a million pages a day, as you can see on our stats page. There are around 50-60 users crawling on an average day.

HN Discussion


Hacker News@lemmy.smeargle.fans

hackernews@lemmy.smeargle.fans


A mirror of Hacker News’ best submissions.
