Can You Fetch External API Data in Jekyll Using Ruby at Build Time?
Is It Possible to Pull Live Data from an API into Jekyll?
Yes — you can fetch data from external APIs using Ruby during the Jekyll build process. By writing a Ruby plugin that runs before render, you can:
- Fetch JSON from a REST API
- Save it into the _data/ folder
- Loop through it in your Liquid templates
The result is fully static HTML with data synced from live APIs at build time.
Use Cases for Fetching External API Data in Jekyll
- Display latest GitHub issues or commits
- Show latest blog posts from another site
- Render pricing, reviews, or ratings from a third-party API
- Pull in CMS data from Notion, Airtable, etc.
Step-by-Step: Fetch External Data via Ruby and Save to _data
1. Create a Plugin File
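Create the file _plugins/fetch_data.rb in your site root. Jekyll loads Ruby files from the _plugins/ directory automatically.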
2. Use Ruby to Fetch and Save API JSON
require "net/http"
require "json"
require "fileutils"

# Fetch five open issues before Jekyll reads the _data/ folder.
Jekyll::Hooks.register :site, :after_init do |site|
  uri = URI("https://api.github.com/repos/jekyll/jekyll/issues?per_page=5")
  response = Net::HTTP.get(uri)
  data = JSON.parse(response)

  # Paths are relative to the directory where the build command is run.
  FileUtils.mkdir_p("_data")
  File.write("_data/github_issues.json", JSON.pretty_generate(data))
end
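The plugin above assumes the request succeeds and that the response is a JSON array. When GitHub rate-limits a request it returns a JSON object with a message instead, which would break the Liquid loop. Here is a more defensive sketch of the same hook. The helper names `fetch_body` and `parse_issue_list` are made up for this example, and the hook only registers when the file is loaded by Jekyll.

```ruby
require "net/http"
require "json"
require "uri"
require "fileutils"

# Hypothetical helper: fetch a URL and return the body only on a 2xx status.
def fetch_body(url)
  response = Net::HTTP.get_response(URI(url))
  raise "request failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end

# Hypothetical helper: parse the body and check that it is the JSON array
# the template expects. Error responses are JSON objects, not arrays.
def parse_issue_list(body)
  data = JSON.parse(body)
  raise ArgumentError, "expected a JSON array, got #{data.class}" unless data.is_a?(Array)
  data
end

# Register the hook only when this file is loaded by Jekyll.
if defined?(Jekyll)
  Jekyll::Hooks.register :site, :after_init do |site|
    begin
      body = fetch_body("https://api.github.com/repos/jekyll/jekyll/issues?per_page=5")
      issues = parse_issue_list(body)

      # Resolve _data/ against the site source rather than the working directory.
      data_dir = File.join(site.source, "_data")
      FileUtils.mkdir_p(data_dir)
      File.write(File.join(data_dir, "github_issues.json"), JSON.pretty_generate(issues))
    rescue StandardError => e
      # Leave any previously written data file untouched and continue the build.
      Jekyll.logger.warn "fetch_data:", "keeping existing data (#{e.message})"
    end
  end
end
```

With this version a failed request logs a warning and the build continues with whatever is already in _data/github_issues.json. Whether that is better than failing the build depends on how stale you can let the data get.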
3. Loop Through the Data in a Template
<ul class="gh-issues">
  {% for issue in site.data.github_issues %}
    <li>
      <a href="{{ issue.html_url }}">#{{ issue.number }}: {{ issue.title }}</a>
    </li>
  {% endfor %}
</ul>
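On every build, this renders the five most recently created open issues from the Jekyll GitHub repository. GitHub's issues endpoint also returns pull requests, so some entries may be pull requests rather than issues.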
When Does This Run?
The :after_init hook fires once the site object is initialized, before Jekyll reads the source files and before any content is rendered. That makes it a good place to fetch data and write it to the _data/ folder.
Can You Fetch Multiple APIs at Once?
Yes. You can call multiple APIs, or loop over multiple endpoints dynamically. Example:
repos = ["jekyll", "jekyll-seo-tag", "minima"]

repos.each do |repo|
  uri = URI("https://api.github.com/repos/jekyll/#{repo}")
  response = Net::HTTP.get(uri)
  data = JSON.parse(response)
  File.write("_data/#{repo}.json", JSON.pretty_generate(data))
end
This creates a separate JSON file in _data/ for each repository.
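When you loop over several endpoints, one bad response shouldn't take the rest down with it. Here is a minimal sketch that isolates failures per repository. `write_repo_data` and its `fetcher:` parameter are hypothetical names for this example. The fetcher is any callable that takes a URL and returns a body, so you can try the loop without touching the network.

```ruby
require "net/http"
require "json"
require "uri"
require "fileutils"

# Hypothetical helper: fetch metadata for each repository and write one JSON
# file per repository into data_dir. Returns the names that were written.
def write_repo_data(repos, data_dir, fetcher: ->(url) { Net::HTTP.get(URI(url)) })
  FileUtils.mkdir_p(data_dir)
  written = []

  repos.each do |repo|
    begin
      body = fetcher.call("https://api.github.com/repos/jekyll/#{repo}")
      data = JSON.parse(body)
      File.write(File.join(data_dir, "#{repo}.json"), JSON.pretty_generate(data))
      written << repo
    rescue StandardError => e
      # A failing endpoint is skipped so the remaining ones still run.
      warn "skipping #{repo}: #{e.message}"
    end
  end

  written
end
```

Inside the hook you would call it as `write_repo_data(["jekyll", "jekyll-seo-tag", "minima"], "_data")`.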
Advanced: Cache Responses for Faster Builds
API calls slow down your build. To prevent repeated calls, cache the response:
# Reuses `uri` from the earlier example. Delete _cache/ to force a refresh.
cache_path = "_cache/issues.json"

if File.exist?(cache_path)
  json = File.read(cache_path)
else
  json = Net::HTTP.get(uri)
  FileUtils.mkdir_p("_cache")
  File.write(cache_path, json)
end

data = JSON.parse(json)
File.write("_data/github_issues.json", JSON.pretty_generate(data))
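The cache above never expires: once _cache/issues.json exists, it is reused until you delete it. One way around that is to refetch when the file is older than some limit. Below is a sketch of that idea. `cached_fetch` and `max_age:` are hypothetical names, and the one-hour limit in the usage line is just an example value.

```ruby
require "fileutils"

# Hypothetical helper: return the cached body if the cache file is younger
# than max_age seconds. Otherwise call the block, store its result, and
# return it.
def cached_fetch(cache_path, max_age:)
  if File.exist?(cache_path) && (Time.now - File.mtime(cache_path)) < max_age
    return File.read(cache_path)
  end

  body = yield
  FileUtils.mkdir_p(File.dirname(cache_path))
  File.write(cache_path, body)
  body
end
```

In the plugin, the if/else from the snippet above would collapse to `json = cached_fetch("_cache/issues.json", max_age: 3600) { Net::HTTP.get(uri) }`.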
Important Notes and Limits
- The built-in GitHub Pages build doesn't run custom Ruby plugins. Build the site with Netlify, GitHub Actions, or a self-hosted Jekyll setup instead.
- Rate limits apply. GitHub’s API is limited to 60 calls/hour for unauthenticated users.
- OAuth tokens can be used via environment variables for private APIs or higher rate limits.
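As a rough sketch of the token approach, the request can carry an Authorization header read from the environment. The variable name GITHUB_TOKEN and the helpers `build_request` and `send_request` are assumptions for this example. Use whatever name your CI provider exposes.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build a GET request and attach a bearer token when one
# is available. GITHUB_TOKEN is an assumed environment variable name.
def build_request(url, token: ENV["GITHUB_TOKEN"])
  request = Net::HTTP::Get.new(URI(url))
  request["Accept"] = "application/vnd.github+json"
  request["Authorization"] = "Bearer #{token}" if token && !token.empty?
  request
end

# Hypothetical helper: send the request and return the Net::HTTP response.
def send_request(request)
  uri = request.uri
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end
```

Inside the hook, `send_request(build_request(url)).body` would stand in for `Net::HTTP.get(uri)`. When no token is set the request goes out unauthenticated, so local builds still work under the lower rate limit.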
Tips for Production Builds
- Use ENV['API_KEY'] for secret keys
- Cache API responses to avoid flakiness
- Run Jekyll via CI tools that support custom Ruby (e.g., Netlify CI or GitHub Actions)
Conclusion
With a bit of Ruby, Jekyll becomes more than a static generator — it can act as a **static frontend for dynamic data**. You pull the content at build time, and ship pure HTML. No client-side JS, no runtime dependencies.
- Safe — content is pre-rendered and secure
- Fast — no runtime queries or API calls
- Powerful — integrates with any external source
Next Steps
- Try fetching GitHub issues or an RSS feed
- Render API content with Liquid and data includes
- Use caching to optimize repeat builds
In the next article, we’ll combine this with **pagination, sorting, and search** — showing how to build a dynamic-like index page from external data in a fully static Jekyll site.
