there is currently a bot inside MIT IP space, address 18[.]4[.]38[.]176, scanning fedi at large. i have confirmed this with 5+ unrelated instance admins, large and small instances, across mastodon/misskey/pleroma/akkoma.
the bot is poorly behaved. i have observed it making repeated requests, multiple times per second, for the exact same paths (the paths being, generally: user profiles, specific posts, and sometimes following links in posts). returning 403s does not stop this activity. one of my domains received hundreds of additional requests despite replying with 403 to all of them. i have also seen it make requests for paths containing html tags - seems like a badly written parser. the purpose of these requests and what data is being gathered is unclear.
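for admins who want to check their own logs for the same pattern, here is a minimal sketch. it assumes your server writes common/combined log format (adjust the regex otherwise); the function name `repeated_requests` and the `min_hits` threshold are made up for illustration, and the IP in the usage example is a documentation placeholder.

```python
import re
from collections import Counter

# assumption: common/combined log format, e.g.
# 203.0.113.5 - - [01/Jan/2024:00:00:00 +0000] "GET /users/alice HTTP/1.1" 403 0 "-" "-"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})')


def repeated_requests(lines, ip, min_hits=10):
    """Count (path, status) pairs requested by `ip`.

    Returns only the pairs seen at least `min_hits` times, which is a rough
    way to spot a client re-requesting the same path after being refused.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(1) == ip:
            # group(3) is the request path, group(4) is the status code
            counts[(m.group(3), m.group(4))] += 1
    return {key: n for key, n in counts.items() if n >= min_hits}
```

usage would look something like `repeated_requests(open("access.log"), "203.0.113.5")`, with the log path and address swapped for your own.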
PTR on the ip returns sts-drand03.mit.edu. a quick web search for "mit drand" brings back https://mitsloan.mit.edu/faculty/directory/david-g-rand and his personal website: https://davidrand-cooperation.com/ (note: other IPs in the /24 also have names in the PTR which match up with names of MIT faculty, but only the .176 IP appears to be involved in this activity).
seems he's doing research into "misinformation" and "fake news" on social media. he also appears to be on fedi! so @Drand@techhub.social, given this activity is sourced from an IP with your name on it, could you share the purpose of this traffic? what data is being collected and how is it being used? do you plan to respect robots.txt or identify yourself in your useragent? is there a process for instance admins to opt out of this activity other than blocking the source IP?
@laurel @p @natalie @graf @Drand Honestly, I gotta make a generic library for this so people do it right, and then me doing it looks just like "approved" people doing it
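a rough sketch of what "doing it right" could look like, assuming it means the things asked for upthread: identify yourself in the useragent, respect robots.txt, don't hammer, and stop asking once you get refused. this is not an existing library. `PoliteClient`, `USER_AGENT`, and the example.org contact details are all hypothetical, and the actual HTTP fetching is left out so the decision logic stands alone.

```python
import time
import urllib.robotparser

# hypothetical identity string: says who is crawling and where to complain
USER_AGENT = "example-research-bot/0.1 (+https://example.org/bot-info)"


class PoliteClient:
    """Decides whether a URL may be requested; does not do the fetching itself."""

    def __init__(self, robots_txt, min_interval=1.0):
        # parse robots.txt text the caller has already downloaded
        self.robots = urllib.robotparser.RobotFileParser()
        self.robots.parse(robots_txt.splitlines())
        self.min_interval = min_interval  # seconds between requests
        self.refused = set()              # URLs the server has refused
        self.last_request = float("-inf")

    def may_request(self, url):
        # never retry a URL the server already refused
        if url in self.refused:
            return False
        return self.robots.can_fetch(USER_AGENT, url)

    def record_status(self, url, status):
        # treat 401/403/410 as "stop asking for this"
        if status in (401, 403, 410):
            self.refused.add(url)

    def wait_turn(self):
        # sleep just long enough to keep min_interval between requests
        delay = self.min_interval - (time.monotonic() - self.last_request)
        if delay > 0:
            time.sleep(delay)
        self.last_request = time.monotonic()
```

the idea is that a caller checks `may_request(url)`, calls `wait_turn()`, sends the request with `USER_AGENT` in the headers, then passes the response code to `record_status`. a 403 then means exactly one request for that path, not hundreds.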
what's ironic is people are only finding out about MIT scraping fedi because they're doing it wrong. if you're doing it right, you *need* to announce yourself like a cartoon villain in order for anybody to think it's happening.
everyone's on about "safety" and "security", but let me ask you this: Did the guy who called MGM and asked nicely for them to deploy ransomware cartoonishly announce himself as a social engineer?