Visit at: definitelynotreddit.com
This is my Reddit clone coding project, which began with Ben Awad's epic 14-hour YouTube tutorial and has grown additional functionality over time. It lets users create posts, upvote/downvote, and comment, and it supports both text posts and content posts (images, videos, and links).
v1.4 Refactor application for link data storage
v1.3 Add support for image, video, and link posts
v1.2 Add comments functionality
v1.1 Redesign with responsive layouts
v1.0 Deploy site with base functionality (user accounts, text posts)
General
Front End
Back End
Deployment
This roller coaster began when I decided to add support for content posts, expanding the usability of the site by letting users embed images, GIFs, videos, and links in their posts.
For images, GIFs, and videos, the process was simple enough: URLs within the post body are parsed and categorized, then rendered as an image or video tag or handed off to ReactPlayer, which supports YouTube, Vimeo, Twitch, and other hosted sources.
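In rough outline, the categorization could look something like the sketch below; the helper and component names are illustrative, not the project's actual code.

```tsx
// Illustrative sketch: categorize a URL from the post body and pick a renderer.
// categorizeUrl and PostMedia are hypothetical names, not the real project code.
import React from "react";
import ReactPlayer from "react-player";

type MediaKind = "image" | "video" | "player" | "link";

const categorizeUrl = (url: string): MediaKind => {
  if (/\.(png|jpe?g|gif|webp)$/i.test(url)) return "image";
  if (/\.(mp4|webm|ogg)$/i.test(url)) return "video";
  if (ReactPlayer.canPlay(url)) return "player"; // YouTube, Vimeo, Twitch, etc.
  return "link"; // anything else falls back to a card preview
};

const PostMedia: React.FC<{ url: string }> = ({ url }) => {
  switch (categorizeUrl(url)) {
    case "image":
      return <img src={url} alt="" />;
    case "video":
      return <video src={url} controls />;
    case "player":
      return <ReactPlayer url={url} controls />;
    default:
      return <a href={url}>{url}</a>; // card previews are handled separately
  }
};

export default PostMedia;
```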
As for web page links, my goal was to display a card preview, similar to the format on Facebook and Twitter social feeds. This required obtaining the page's meta tag data. The first method I found was a package called link-preview-generator. Because it uses Puppeteer to scrape said page, the implementation had to be server-side.
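In rough form, the read-time scraping worked something like the sketch below. The wrapper name and the exact return shape are approximations, and link-preview-generator's call signature is written from memory of its README.

```ts
// Rough sketch of the original read-time scraping. fetchLinkPreview is an
// illustrative wrapper name; the return shape below is approximate.
// link-preview-generator ships without type definitions, so it is required loosely.
// eslint-disable-next-line @typescript-eslint/no-var-requires
const linkPreviewGenerator = require("link-preview-generator");

export interface LinkPreview {
  title?: string;
  description?: string;
  domain?: string;
  img?: string;
}

export const fetchLinkPreview = async (url: string): Promise<LinkPreview | null> => {
  try {
    // Launches headless Chromium via Puppeteer under the hood, so every call is heavy.
    return await linkPreviewGenerator(url);
  } catch {
    return null; // scrape timed out or was blocked
  }
};
```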
In retrospect, it was a glaring mistake to fetch the metadata at post read time instead of post creation time. I had wanted to avoid refactoring the database to store link information, relying instead on live scraping every time a user accessed the site. Always having up-to-date metadata is nice, but with modest server resources the design was completely unsustainable: the server was frequently overloaded, and repeated scraping was sometimes detected and blocked by the target sites' bot defenses.
Several other issues arose, including configuring Puppeteer to run properly inside a Docker container. While debugging this, I experimented with two other metadata services (URL Meta and LinkPreview) and ultimately added them as backup processes for redundancy.
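A common way to get Puppeteer working inside a slim container is to point it at a system-installed Chromium and pass the usual sandbox flags; a minimal sketch of that approach is below (the environment-variable name is a placeholder, not necessarily how the project configures it).

```ts
// Common workaround for Puppeteer inside a Docker container: use a
// system-installed Chromium and disable the sandbox, which typically cannot
// be set up in an unprivileged container. CHROMIUM_PATH is a placeholder.
import puppeteer from "puppeteer";

export const launchBrowser = () =>
  puppeteer.launch({
    executablePath: process.env.CHROMIUM_PATH, // e.g. /usr/bin/chromium-browser
    args: ["--no-sandbox", "--disable-setuid-sandbox", "--disable-dev-shm-usage"],
  });
```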
Eventually, I bit the bullet and refactored the database and the link preview component to retrieve and store metadata at post creation time, though not before encountering another bewildering problem.
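The shape of that refactor, sketched very loosely: scrape once when the post is created and persist the result, so reads never touch Puppeteer. The entity and column names below are assumptions, and the TypeORM-style decorators may not match the actual stack.

```ts
// Sketch of the refactor: scrape at creation time and store the metadata on
// the post, so reading a post never triggers a scrape. Field names are assumed.
import { BaseEntity, Column, Entity, PrimaryGeneratedColumn } from "typeorm";
import { fetchLinkPreview } from "./fetchLinkPreview"; // hypothetical helper, as in the earlier sketch

@Entity()
export class Post extends BaseEntity {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  text!: string;

  // Link metadata captured once, at creation time.
  @Column({ nullable: true }) linkTitle?: string;
  @Column({ nullable: true }) linkDescription?: string;
  @Column({ nullable: true }) linkImage?: string;
}

export const createPost = async (text: string, url?: string): Promise<Post> => {
  const post = Post.create({ text });
  if (url) {
    const preview = await fetchLinkPreview(url);
    post.linkTitle = preview?.title;
    post.linkDescription = preview?.description;
    post.linkImage = preview?.img;
  }
  return post.save();
};
```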
Once I had Puppeteer working, I deployed the site and created a post announcing the new features and demoing the link preview functionality. The site I linked in that post was Definitely Not Reddit itself.
Upon the next deployment, everything went bonkers. The web component crashed immediately, and the error logs filled with recurring, unhandled GET requests. The server virtual machine spiked to maximum CPU and memory utilization and locked up with runtime errors.
After a lot of troubleshooting, I realized that the problem was a self-scraping loop caused by that post. As the front end was being deployed, Next.js's server-side rendering would attempt to retrieve the metadata of the very site that was still being deployed, and the repeated failed scrapes spawned runaway processes on the back end. The result was a total system meltdown.
My solution at the time was to add a special case on the front end: a link preview request for the site itself would be intercepted and served from the meta tags already present in the site's own code. Eventually, this patch was rendered unnecessary by a redesign of the whole process, but it was, suffice it to say, a memorable and challenging puzzle.
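In sketch form, the patch amounted to a hostname check in front of the preview fetch; the constants and the endpoint below are placeholders rather than the real values.

```ts
// Illustrative version of the self-link patch: requests for the site's own
// URL are short-circuited with hardcoded metadata instead of being scraped.
// SELF_HOSTS, OWN_META, and the /api/link-preview endpoint are placeholders.
export interface LinkPreview {
  title?: string;
  description?: string;
  img?: string;
}

const SELF_HOSTS = new Set(["definitelynotreddit.com", "www.definitelynotreddit.com"]);

const OWN_META: LinkPreview = {
  title: "Definitely Not Reddit",
  description: "A Reddit clone with text and content posts.", // placeholder copy
};

const requestPreviewFromApi = async (url: string): Promise<LinkPreview | null> => {
  // Placeholder for the real back-end call that performed the scrape.
  const res = await fetch(`/api/link-preview?url=${encodeURIComponent(url)}`);
  return res.ok ? res.json() : null;
};

export const getPreview = async (url: string): Promise<LinkPreview | null> => {
  if (SELF_HOSTS.has(new URL(url).hostname)) {
    return OWN_META; // never ask the back end to scrape ourselves
  }
  return requestPreviewFromApi(url);
};
```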
MIT © 2020 Michael W. Lu