8000+ SEO Location Pages in AEM Without JCR Nodes: 3 Solutions Compared
The Context
A client requested dedicated pages for every Italian region, province, and municipality to map local resources and improve SEO rankings.
The Numbers
- 20 regions
- ~110 provinces
- ~8000 municipalities
- Total: ~8130 pages
Requirements
✅ Real server-side HTML (not SPA/React) to be indexed by search engines
✅ Clean SEO-friendly URLs: /locations/lombardia/milano/navigli.html
✅ Dynamic content based on location (region/province/municipality name)
❌ Zero JCR nodes (or bare minimum) to avoid bloating authoring
The Problem
Creating 8130 physical pages in /content would be a disaster:
- ❌ Degraded siteadmin performance
- ❌ Very slow backup/restore
- ❌ Complex deployments
- ❌ Impossible to maintain at scale
- ❌ Risk of timeouts when rendering the tree
Question: How to generate thousands of SEO-friendly HTML pages without creating thousands of JCR nodes?
Solution 1: ISTAT JSON + In-Memory Service (Implemented)
This is the solution I implemented in the real project.
Strategy
- ISTAT Data: Download official ISTAT CSV with regions/provinces/municipalities
- JSON Conversion: Script to convert CSV → structured JSON
- In-memory Service: Load JSON into RAM at AEM startup
- Dispatcher rewrite: Convert clean URLs into Sling selectors
- Filter + Session: Validate location and pass data via session
- Sling Rewriter: Replace HTML placeholders with actual data
1. ISTAT Data - Official Source
Data Source: ISTAT - Codes of Municipalities, Provinces and Regions
ISTAT (Italian National Institute of Statistics) provides the official dataset with all Italian municipalities in CSV format. The file contains:
- Official name (with spaces, apostrophes, special characters)
- ISTAT code
- Province affiliation
- Region affiliation
Problem: Names contain non-URL-safe characters (Valle d'Aosta, Reggio nell'Emilia, Saint-Vincent).
Solution: Conversion script CSV → JSON with automatic URL-safe slug generation.
Target JSON Structure
The conversion script emits a nested structure that pairs each official name with its URL-safe slug (the // comments below are annotations, not part of the generated file):
{
"regioni": [
{
"name": "Valle d'Aosta", // Official name
"slug": "valle-d-aosta", // URL-safe
"province": [
{
"name": "Aosta",
"slug": "aosta",
"comuni": [
{
"name": "Saint-Vincent",
"slug": "saint-vincent"
}
]
}
]
},
{
"name": "Emilia-Romagna",
"slug": "emilia-romagna",
"province": [
{
"name": "Reggio nell'Emilia",
"slug": "reggio-nell-emilia",
"comuni": [
{
"name": "Reggio nell'Emilia",
"slug": "reggio-nell-emilia"
}
]
}
]
}
]
}
File saved in: src/main/resources/data/comuni-italia.json
CSV → JSON Conversion Script
// convert-istat-csv.js
// Download CSV from: https://www.istat.it/classificazione/...
const fs = require('fs');
const csv = require('csv-parser');
const results = {
regioni: []
};
const regioniMap = new Map();
const provinceMap = new Map();
// Function to create URL-safe slugs
function slugify(text) {
return text
.toLowerCase()
.normalize('NFD') // Decompose accented characters
.replace(/[\u0300-\u036f]/g, '') // Remove accents
.replace(/['’]/g, '-') // Apostrophes (straight or curly) become word separators
.replace(/[^a-z0-9]+/g, '-') // Replace non-alphanumeric with -
.replace(/^-+|-+$/g, ''); // Remove leading/trailing -
}
fs.createReadStream('comuni-istat.csv')
.pipe(csv({ separator: ';' }))
.on('data', (row) => {
const regioneName = row['Denominazione regione'];
const provinciaName = row['Denominazione provincia'];
const comuneName = row['Denominazione comune'];
// Skip headers
if (!regioneName || regioneName === 'Denominazione regione') return;
const regioneSlug = slugify(regioneName);
const provinciaSlug = slugify(provinciaName);
const comuneSlug = slugify(comuneName);
// Create region if not exists
if (!regioniMap.has(regioneSlug)) {
const regione = {
name: regioneName,
slug: regioneSlug,
province: []
};
regioniMap.set(regioneSlug, regione);
results.regioni.push(regione);
}
const regione = regioniMap.get(regioneSlug);
// Create province if not exists
const provinciaKey = `${regioneSlug}/${provinciaSlug}`;
if (!provinceMap.has(provinciaKey)) {
const provincia = {
name: provinciaName,
slug: provinciaSlug,
comuni: []
};
provinceMap.set(provinciaKey, provincia);
regione.province.push(provincia);
}
const provincia = provinceMap.get(provinciaKey);
// Add municipality
provincia.comuni.push({
name: comuneName,
slug: comuneSlug
});
})
.on('end', () => {
// Save JSON
fs.writeFileSync(
'comuni-italia.json',
JSON.stringify(results, null, 2),
'utf-8'
);
console.log('✅ Conversion completed!');
console.log(` Regions: ${results.regioni.length}`);
console.log(` Provinces: ${provinceMap.size}`);
let totaleComuni = 0;
results.regioni.forEach(r =>
r.province.forEach(p =>
totaleComuni += p.comuni.length
)
);
console.log(` Municipalities: ${totaleComuni}`);
});
Execution:
npm install csv-parser
node convert-istat-csv.js
# Output: comuni-italia.json (~1.5 MB)
Generated slug examples:
- Valle d'Aosta → valle-d-aosta
- Reggio nell'Emilia → reggio-nell-emilia
- Bolzano/Bozen → bolzano-bozen
- Forlì-Cesena → forli-cesena
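If slugs also need to be computed or validated inside the AEM bundle (for example in a unit test that checks the generated JSON), the same rules can be mirrored in Java. This is a minimal sketch, not code from the project: the `Slugs` class name is hypothetical, and it assumes apostrophes are treated as word separators so that Valle d'Aosta becomes valle-d-aosta.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper mirroring the slug rules of the Node.js conversion script.
final class Slugs {

    private Slugs() {}

    static String slugify(String text) {
        // Lowercase first, then decompose accented characters (ì -> i + combining accent)
        String decomposed = Normalizer.normalize(
            text.toLowerCase(Locale.ROOT), Normalizer.Form.NFD);
        return decomposed
            .replaceAll("[\\u0300-\\u036f]+", "")  // drop the combining accents
            .replaceAll("[^a-z0-9]+", "-")         // apostrophes, spaces, slashes -> "-"
            .replaceAll("^-+|-+$", "");            // trim leading/trailing "-"
    }
}
```

Running the official names through this helper and comparing the result with the `slug` fields in the JSON is a cheap way to catch drift between the two implementations.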
2. In-Memory Service with Nested JSON
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.commons.io.IOUtils;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Component(service = LocationDataService.class, immediate = true)
public class LocationDataServiceImpl implements LocationDataService {

    private static final Logger LOG = LoggerFactory.getLogger(LocationDataServiceImpl.class);

    private JSONObject locationsData;

    @Activate
    protected void activate() throws Exception {
        LOG.info("Loading ISTAT locations data...");
        // Load JSON from the bundle classpath; try-with-resources closes the stream
        try (InputStream jsonStream = getClass().getResourceAsStream("/data/comuni-italia.json")) {
            if (jsonStream == null) {
                throw new IllegalStateException("Missing resource /data/comuni-italia.json");
            }
            String jsonString = IOUtils.toString(jsonStream, StandardCharsets.UTF_8);
            // Parse with org.json (Gson would work equally well)
            this.locationsData = new JSONObject(jsonString);
        }
        // Count municipalities for logging
        int count = countLocations(locationsData);
        LOG.info("Loaded {} municipalities in memory", count);
    }

    @Override
    public LocationData get(String regionSlug, String provinciaSlug, String comuneSlug) {
        try {
            // Hierarchical lookup: region → province → municipality
            JSONArray regioni = locationsData.getJSONArray("regioni");
            // Find region
            JSONObject regione = findBySlug(regioni, regionSlug);
            if (regione == null) return null;
            // Find province
            JSONArray province = regione.getJSONArray("province");
            JSONObject provincia = findBySlug(province, provinciaSlug);
            if (provincia == null) return null;
            // Find municipality
            JSONArray comuni = provincia.getJSONArray("comuni");
            JSONObject comune = findBySlug(comuni, comuneSlug);
            if (comune == null) return null;
            // Build LocationData
            return new LocationData(
                comune.getString("name"),
                provincia.getString("name"),
                regione.getString("name")
            );
        } catch (JSONException e) {
            // Only reached when the JSON itself is malformed (missing key, wrong type)
            LOG.warn("Malformed location data for {}/{}/{}", regionSlug, provinciaSlug, comuneSlug, e);
            return null;
        }
    }

    @Override
    public boolean exists(String regionSlug, String provinciaSlug, String comuneSlug) {
        return get(regionSlug, provinciaSlug, comuneSlug) != null;
    }

    // Helper: find object in array by slug (linear scan)
    private JSONObject findBySlug(JSONArray array, String slug) throws JSONException {
        if (slug == null) return null;
        for (int i = 0; i < array.length(); i++) {
            JSONObject obj = array.getJSONObject(i);
            if (slug.equals(obj.getString("slug"))) {
                return obj;
            }
        }
        return null;
    }

    private int countLocations(JSONObject data) throws JSONException {
        int count = 0;
        JSONArray regioni = data.getJSONArray("regioni");
        for (int i = 0; i < regioni.length(); i++) {
            JSONArray province = regioni.getJSONObject(i).getJSONArray("province");
            for (int j = 0; j < province.length(); j++) {
                count += province.getJSONObject(j).getJSONArray("comuni").length();
            }
        }
        return count;
    }
}
Nested approach advantages:
- ✅ Simplicity: Mirrors the natural data hierarchy
- ✅ No composite keys: Direct navigation region → province → municipality
- ✅ Memory footprint: ~2-5 MB (comparable to a flat HashMap)
- ✅ Performance: three short linear scans (region, then province, then municipality), typically under 1 ms in total
Note: The findBySlug() array lookup is O(n) at each level, but n stays small:
- Regions = 20 → at most 20 comparisons
- Provinces per region ≤ 12 → at most 12 comparisons
- Municipalities per province = variable (a few hundred at most), still a trivial in-memory scan
Total performance: ~0.5-1 ms per lookup, in practice on par with a flat HashMap
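For comparison, the flat HashMap alternative mentioned above could look roughly like this. It is a sketch rather than the project's code: the `LocationIndex` class and its method names are hypothetical, and the composite key simply joins the three slugs with "/".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical flat index: one hash lookup instead of three linear scans.
final class LocationIndex {

    // Key: "region/province/municipality" slugs.
    // Value: official names as {municipality, province, region}.
    private final Map<String, String[]> bySlugPath = new HashMap<>();

    private static String key(String regionSlug, String provinceSlug, String comuneSlug) {
        return regionSlug + "/" + provinceSlug + "/" + comuneSlug;
    }

    void add(String regionSlug, String provinceSlug, String comuneSlug,
             String comuneName, String provinceName, String regionName) {
        bySlugPath.put(key(regionSlug, provinceSlug, comuneSlug),
            new String[] {comuneName, provinceName, regionName});
    }

    /** Returns {municipality, province, region} names, or null when unknown. */
    String[] get(String regionSlug, String provinceSlug, String comuneSlug) {
        return bySlugPath.get(key(regionSlug, provinceSlug, comuneSlug));
    }

    boolean exists(String regionSlug, String provinceSlug, String comuneSlug) {
        return bySlugPath.containsKey(key(regionSlug, provinceSlug, comuneSlug));
    }
}
```

The trade-off is the composite key: lookups become constant-time, but the natural region → province → municipality hierarchy has to be rebuilt separately whenever a listing page needs it.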
3. Dispatcher Rewrite Rules
Goal: Convert SEO-friendly URLs into Sling selectors
# Apache httpd vhost (mod_rewrite rules live here, not in dispatcher.any)
# Region + Province + Municipality
RewriteRule ^/locations/([a-z0-9-]+)/([a-z0-9-]+)/([a-z0-9-]+)\.html$ \
/content/mysite/locations.$1.$2.$3.html [PT,L]
# Region + Province only
RewriteRule ^/locations/([a-z0-9-]+)/([a-z0-9-]+)\.html$ \
/content/mysite/locations.$1.$2.html [PT,L]
# Region only
RewriteRule ^/locations/([a-z0-9-]+)\.html$ \
/content/mysite/locations.$1.html [PT,L]
Examples:
- /locations/lombardia/milano/navigli.html → /content/mysite/locations.lombardia.milano.navigli.html
- /locations/emilia-romagna/bologna.html → /content/mysite/locations.emilia-romagna.bologna.html
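The mapping performed by these rules can be mirrored in plain Java, which makes it easy to unit-test URL shapes before touching the web server configuration. This is an illustrative sketch only: `LocationUrlMapper` is a hypothetical class, and the single regex below collapses the three RewriteRule patterns into one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical test helper that reproduces the clean-URL -> selector mapping.
final class LocationUrlMapper {

    // One required slug (region) plus up to two optional ones (province, municipality)
    private static final Pattern CLEAN_URL = Pattern.compile(
        "^/locations/([a-z0-9-]+)(?:/([a-z0-9-]+))?(?:/([a-z0-9-]+))?\\.html$");

    private LocationUrlMapper() {}

    /** Returns the selector-based Sling path, or null when the URL is not a location URL. */
    static String toSlingPath(String cleanUrl) {
        Matcher m = CLEAN_URL.matcher(cleanUrl);
        if (!m.matches()) {
            return null;
        }
        StringBuilder path = new StringBuilder("/content/mysite/locations");
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.group(i) != null) {
                path.append('.').append(m.group(i)); // each slug becomes one selector
            }
        }
        return path.append(".html").toString();
    }
}
```

Because the slug pattern only allows lowercase letters, digits and hyphens, a slug can never contain a dot, so each selector maps back to exactly one path segment.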
4. Sling Filter - Validation and Session
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.engine.EngineConstants;
import org.osgi.framework.Constants;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

@Component(
    service = Filter.class,
    property = {
        EngineConstants.SLING_FILTER_SCOPE + "=" + EngineConstants.FILTER_SCOPE_REQUEST,
        Constants.SERVICE_RANKING + ":Integer=1000"
    }
)
public class LocationDataFilter implements Filter {

    private static final String SESSION_KEY = "location.current.data";
    private static final String LOCATION_PAGE_PATH = "/content/mysite/locations";

    @Reference
    private LocationDataService locationService;

    @Override
    public void init(FilterConfig filterConfig) {
        // Nothing to initialize
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        SlingHttpServletRequest slingRequest = (SlingHttpServletRequest) request;
        // Check if it's a location page
        String path = slingRequest.getRequestPathInfo().getResourcePath();
        if (path.startsWith(LOCATION_PAGE_PATH)) {
            // Read selectors: .lombardia.milano.navigli
            String[] selectors = slingRequest.getRequestPathInfo().getSelectors();
            if (selectors.length >= 1) {
                String regionSlug = selectors[0];
                String provinciaSlug = selectors.length >= 2 ? selectors[1] : null;
                String comuneSlug = selectors.length >= 3 ? selectors[2] : null;
                // Determine location type and validate
                LocationData locationData = null;
                if (comuneSlug != null) {
                    // Municipality: a single lookup both validates and loads the data
                    locationData = locationService.get(regionSlug, provinciaSlug, comuneSlug);
                } else if (provinciaSlug != null) {
                    // Province (similar logic)
                } else {
                    // Region (similar logic)
                }
                if (locationData != null) {
                    // Save to session for template
                    slingRequest.getSession().setAttribute(SESSION_KEY, locationData);
                } else {
                    // Location not found → 404
                    ((HttpServletResponse) response).sendError(
                        HttpServletResponse.SC_NOT_FOUND, "Location not found");
                    return;
                }
            }
        }
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
        // Nothing to clean up
    }
}
5. Template Page Component
A single template page in /content/mysite/locations reused for all 8130 locations.
Sling Model:
@Model(adaptables = SlingHttpServletRequest.class)
public class LocationPageModel {
private static final String SESSION_KEY = "location.current.data";
@SlingObject
private SlingHttpServletRequest request;
public LocationData getLocationData() {
return (LocationData) request.getSession().getAttribute(SESSION_KEY);
}
public String getPageTitle() {
LocationData data = getLocationData();
return data != null ? "Resources in " + data.getComuneName() : "Location";
}
}
HTL Template:
<div data-sly-use.model="com.mysite.models.LocationPageModel">
<h1>{{COMUNE_NAME}}</h1>
<p>Province: {{PROVINCIA_NAME}}</p>
<p>Region: {{REGIONE_NAME}}</p>
<div class="resources">
<!-- Standard AEM components for content -->
</div>
</div>
6. Sling Rewriter - Placeholder Replacement
Rewriter Pipeline Configuration:
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0"
xmlns:jcr="http://www.jcp.org/jcr/1.0"
jcr:primaryType="sling:Folder">
<location-placeholder-rewriter
jcr:primaryType="nt:unstructured"
enabled="{Boolean}true"
generatorType="htmlparser"
order="{Long}1"
serializerType="htmlwriter"
transformerTypes="[location-placeholder-replacer]"
paths="[/content/mysite/locations]"/>
</jcr:root>
Transformer Implementation:
@Component(
service = TransformerFactory.class,
property = {
"pipeline.type=location-placeholder-replacer"
}
)
public class LocationPlaceholderTransformerFactory implements TransformerFactory {
@Override
public Transformer createTransformer() {
return new AbstractTransformer() {
private LocationData locationData;
@Override
public void init(ProcessingContext context, ProcessingComponentConfiguration config) {
// Retrieve data from session
SlingHttpServletRequest request = context.getRequest();
locationData = (LocationData) request.getSession()
.getAttribute("location.current.data");
}
@Override
public void characters(char[] chars, int offset, int length) throws SAXException {
if (locationData != null) {
String text = new String(chars, offset, length);
// Replace placeholders
text = text.replace("{{COMUNE_NAME}}", locationData.getComuneName());
text = text.replace("{{PROVINCIA_NAME}}", locationData.getProvinciaName());
text = text.replace("{{REGIONE_NAME}}", locationData.getRegioneName());
char[] replaced = text.toCharArray();
super.characters(replaced, 0, replaced.length);
} else {
super.characters(chars, offset, length);
}
}
};
}
}
Final HTML result:
<!-- HTL template had: -->
<h1>{{COMUNE_NAME}}</h1>
<!-- Browser receives: -->
<h1>Navigli</h1>
Solution 1 Advantages
✅ Zero JCR nodes (only the template /content/mysite/locations)
✅ Excellent performance (in-memory lookup, <1ms)
✅ Perfect SEO (clean URLs, server-side HTML)
✅ Official ISTAT data (authoritative and updated source)
✅ Simple to implement (uses native AEM features)
✅ Dispatcher cacheable (once rendered, cached for hours)
Solution 1 Disadvantages
❌ Hard-coded data in bundle (update = redeploy)
❌ Not authorable (authors don't see pages in siteadmin)
❌ Manual sitemap (requires custom servlet to generate it)
Solution 2: /var Nodes with Tuples + Cache
Alternative: Store data under /var instead of in-memory.
/var Structure
/var/locations (nt:unstructured)
  /lombardia (nt:unstructured)
    name = "Lombardia"
    slug = "lombardia"
    /milano (nt:unstructured)
      name = "Milano"
      slug = "milano"
      comuni (String[]) = [
        "milano|Milano",
        "legnano|Legnano",
        "rho|Rho",
        "sesto-san-giovanni|Sesto San Giovanni",
        ...
      ]
Total nodes: ~130 (20 regions + 110 provinces)
Municipalities: Saved as tuple arrays "slug|Official Name"
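The tuple format is simple enough to isolate in a small helper, which also makes the matching rule explicit. The sketch below is illustrative: `ComuneTuples` and its methods are hypothetical names, and it assumes slugs never contain the "|" separator (true for slugs limited to lowercase letters, digits and hyphens).

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical codec for the "slug|Official Name" tuples stored on province nodes.
final class ComuneTuples {

    private static final char SEPARATOR = '|';

    private ComuneTuples() {}

    static String encode(String slug, String officialName) {
        if (slug.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("Slug must not contain '|': " + slug);
        }
        return slug + SEPARATOR + officialName;
    }

    /** Finds the official name for a slug, matching on the full "slug|" prefix. */
    static Optional<String> findName(String[] tuples, String slug) {
        String prefix = slug + SEPARATOR;
        return Arrays.stream(tuples)
            .filter(t -> t.startsWith(prefix))
            .map(t -> t.substring(prefix.length()))
            .findFirst();
    }
}
```

Matching on the slug plus the separator matters: a bare `startsWith(slug)` would let "sesto" match "sesto-san-giovanni|Sesto San Giovanni".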
Service with Cache
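@Component(service = LocationDataService.class)
public class LocationDataServiceImpl implements LocationDataService {

    @Reference
    private ResourceResolverFactory resolverFactory;

    // Cache only ~110 provinces. A Guava CacheLoader must never return null,
    // so a missing province is cached as an empty java.util.Optional.
    private LoadingCache<String, Optional<ProvinciaData>> provinciaCache;

    @Activate
    protected void activate() {
        this.provinciaCache = CacheBuilder.newBuilder()
            .maximumSize(150)
            .expireAfterWrite(1, TimeUnit.HOURS)
            .build(new CacheLoader<String, Optional<ProvinciaData>>() {
                @Override
                public Optional<ProvinciaData> load(String key) {
                    return Optional.ofNullable(loadProvinciaFromJcr(key));
                }
            });
    }

    @Override
    public LocationData get(String regionSlug, String provinciaSlug, String comuneSlug) {
        String key = regionSlug + "/" + provinciaSlug;
        // getUnchecked avoids the checked ExecutionException thrown by get()
        ProvinciaData provincia = provinciaCache.getUnchecked(key).orElse(null);
        if (provincia == null) {
            return null;
        }
        // Search municipality in tuple array ("slug|Official Name")
        String prefix = comuneSlug + "|";
        return provincia.getComuni().stream()
            .filter(t -> t.startsWith(prefix))
            .findFirst()
            .map(t -> new LocationData(
                t.substring(prefix.length()), provincia.getName(), provincia.getRegionName()))
            .orElse(null);
    }

    private ProvinciaData loadProvinciaFromJcr(String key) {
        try (ResourceResolver resolver = getServiceResolver()) {
            Resource res = resolver.getResource("/var/locations/" + key);
            if (res == null) {
                return null;
            }
            ValueMap props = res.getValueMap();
            // The official region name lives on the parent node, not in the slug
            String regionName = res.getParent().getValueMap().get("name", String.class);
            // Assumes ProvinciaData carries (name, regionName, comuni)
            return new ProvinciaData(
                props.get("name", String.class),
                regionName,
                Arrays.asList(props.get("comuni", new String[0])));
        } catch (LoginException e) {
            // Fail loudly so a login problem is not cached as "province not found"
            throw new IllegalStateException("Cannot open service resource resolver", e);
        }
    }

    private ResourceResolver getServiceResolver() throws LoginException {
        // "location-data-reader" is a placeholder sub-service name; it needs a
        // matching service user mapping with read access to /var/locations
        return resolverFactory.getServiceResourceResolver(
            Map.<String, Object>of(ResourceResolverFactory.SUBSERVICE, "location-data-reader"));
    }
}
ISTAT Import Script → /var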
public void importIstatToVar(InputStream jsonStream) throws Exception {
try (ResourceResolver resolver = getServiceResolver()) {
ObjectMapper mapper = new ObjectMapper();
IstatData data = mapper.readValue(jsonStream, IstatData.class);
Resource varRoot = resolver.getResource("/var");
Resource locationsRoot = resolver.create(varRoot, "locations",
Map.of("jcr:primaryType", "nt:unstructured"));
for (RegionData region : data.getRegioni()) {
Resource regionNode = resolver.create(locationsRoot, region.getSlug(),
Map.of("name", region.getName(), "slug", region.getSlug()));
for (ProvinciaData provincia : region.getProvince()) {
// Create tuple array "slug|name"
String[] comuniTuples = provincia.getComuni().stream()
.map(c -> c.getSlug() + "|" + c.getName())
.toArray(String[]::new);
resolver.create(regionNode, provincia.getSlug(),
Map.of(
"name", provincia.getName(),
"slug", provincia.getSlug(),
"comuni", comuniTuples
));
}
}
resolver.commit();
LOG.info("✅ Import completed: ~130 /var nodes created");
}
}
Solution 2 Advantages
✅ Persistent data (not lost on restart)
✅ Simple updates (modify /var nodes, no redeploy)
✅ Intelligent cache (only accessed provinces)
✅ Automatic backup (part of repository)
✅ JCR queries possible (if needed)
Solution 2 Disadvantages
❌ ~130 additional nodes in JCR (minimal, but still present)
❌ Slightly slower than pure in-memory (cache miss = JCR read)
❌ Requires initial import script
Solution 3: External Backend + API Cache
Alternative: Data on external backend (microservice, DB, headless CMS).
Architecture
AEM → HTTP Client → Backend API → ISTAT Database
                         ↓
                    Cache Layer
                 (Redis/Memcached)
Service with HTTP Client
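@Component(service = LocationDataService.class)
public class LocationDataServiceImpl implements LocationDataService {

    private static final String API_BASE = "https://api.mycompany.com/locations";
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // HttpClient itself is not an OSGi service; the builder factory
    // (org.apache.http.osgi.services.HttpClientBuilderFactory) is
    @Reference
    private HttpClientBuilderFactory httpClientBuilderFactory;

    private CloseableHttpClient httpClient;

    // Guava in-memory cache. A CacheLoader must never return null,
    // so "not found" is cached as an empty java.util.Optional.
    private LoadingCache<String, Optional<LocationData>> cache;

    @Activate
    protected void activate() {
        this.httpClient = httpClientBuilderFactory.newBuilder().build();
        this.cache = CacheBuilder.newBuilder()
            .maximumSize(1000) // Keep up to 1000 recently used locations
            .expireAfterWrite(6, TimeUnit.HOURS)
            .build(new CacheLoader<String, Optional<LocationData>>() {
                @Override
                public Optional<LocationData> load(String key) throws Exception {
                    return Optional.ofNullable(fetchFromBackend(key));
                }
            });
    }

    @Deactivate
    protected void deactivate() throws IOException {
        httpClient.close();
    }

    @Override
    public LocationData get(String regionSlug, String provinciaSlug, String comuneSlug) {
        String key = regionSlug + "/" + provinciaSlug + "/" + comuneSlug;
        try {
            return cache.get(key).orElse(null);
        } catch (ExecutionException e) {
            // Backend unreachable or returned an error: nothing is cached for this key
            return null;
        }
    }

    private LocationData fetchFromBackend(String key) throws IOException {
        String apiUrl = API_BASE + "/" + key + ".json";
        // try-with-resources releases the connection back to the pool
        try (CloseableHttpResponse response = httpClient.execute(new HttpGet(apiUrl))) {
            int status = response.getStatusLine().getStatusCode();
            if (status == 200) {
                String json = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
                return MAPPER.readValue(json, LocationData.class);
            }
            if (status == 404) {
                return null; // Unknown location, safe to cache as "not found"
            }
            throw new IOException("Unexpected status " + status + " from " + apiUrl);
        }
    }
}
Backend API Endpoint (Node.js example)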
// Express.js API
app.get('/locations/:region/:provincia/:comune.json', async (req, res) => {
const { region, provincia, comune } = req.params;
// Query database
const location = await db.query(
'SELECT * FROM comuni WHERE slug = ? AND provincia_slug = ?',
[comune, provincia]
);
if (location) {
res.json(location);
} else {
res.status(404).json({ error: 'Not found' });
}
});
Solution 3 Advantages
✅ Centralized data (shareable across multiple systems)
✅ Real-time updates (no AEM redeploy)
✅ Independent scalability (backend can scale separately)
✅ Dynamic data (can change frequently)
Solution 3 Disadvantages
❌ External dependency (if backend down, AEM fails)
❌ Network latency (even with cache, first access is slow)
❌ Architectural complexity (more systems to maintain)
❌ Infrastructure costs (database, API server, Redis cache)
Comparison of 3 Solutions
| Aspect | Sol. 1: In-Memory JSON | Sol. 2: /var Nodes | Sol. 3: Backend API |
|---|---|---|---|
| Performance | 🚀 Very fast (<1ms) | ⚡ Fast (2-5ms) | 🐢 Depends (10-50ms first hit) |
| JCR Nodes | ✅ Zero | ⚠️ ~130 | ✅ Zero |
| Data Update | ♻️ Bundle redeploy | ✏️ Update nodes | ⚡ Real-time |
| Persistence | ⚠️ In RAM, reloaded from bundle at startup | ✅ Persistent | ✅ Persistent |
| Dependencies | ✅ Zero | ✅ Only AEM | ❌ External backend |
| Complexity | 🟢 Low | 🟡 Medium | 🔴 High |
| Costs | 💰 Zero | 💰 Zero | 💰💰 External infra |
| Scalability | ⚠️ Limited (RAM) | ✅ Good | ✅ Excellent |
| Authoring | ❌ No | ❌ No | ❌ No |
Conclusions and Recommendations
When to use Solution 1 (In-Memory JSON)
✅ Small/medium projects (~10k locations max)
✅ Static data (rarely changes)
✅ Limited budget (no extra infrastructure)
✅ Fast time-to-market (quick implementation)
When to use Solution 2 (/var Nodes)
✅ Enterprise projects with strong governance
✅ Periodically changing data (monthly updates)
✅ Team prefers JCR as source of truth
✅ Backup/restore important (part of repository)
When to use Solution 3 (Backend API)
✅ Shared data across multiple systems (AEM, mobile app, etc.)
✅ Frequent real-time updates
✅ Maximum scalability (millions of locations)
✅ Microservices architecture already in place
My Final Choice
In the real project I chose Solution 1 (In-Memory JSON) because:
- ✅ Stable ISTAT data (updated 1-2 times per year)
- ✅ Critical performance (high-traffic e-commerce)
- ✅ Limited budget (no external backend)
- ✅ 8130 locations = ~5 MB RAM (negligible)
- ✅ Fast implementation (2 days development)
Results
- 🚀 8130 SEO pages with zero JCR nodes
- ⚡ <5ms TTFB (Time To First Byte)
- 📈 +300% organic traffic in 6 months
- 💾 Dispatcher cache hit >95%
- 👨‍💻 Zero overhead for authors
Sitemap.xml Bonus
For SEO, a sitemap with all 8130 URLs is needed:
@Component(service = Servlet.class, ...)
public class LocationSitemapServlet extends SlingSafeMethodsServlet {
@Reference
private LocationDataService locationService;
@Override
protected void doGet(SlingHttpServletRequest request,
SlingHttpServletResponse response) throws IOException {
response.setContentType("application/xml");
response.setCharacterEncoding("UTF-8"); // Match the encoding declared in the XML prolog
PrintWriter out = response.getWriter();
out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
out.println("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">");
// Loop through all locations
for (LocationData location : locationService.getAllLocations()) {
out.printf("<url><loc>%s</loc><priority>0.7</priority></url>%n",
location.getUrl());
}
out.println("</urlset>");
}
}
Endpoint: /locations-sitemap.xml
Submit to: Google Search Console
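The sitemap protocol expects absolute URLs in each loc element, with XML special characters entity-escaped. A minimal sketch of how a single entry could be built from slugs is shown below; `SitemapXml` is a hypothetical helper, and the base URL used in the usage note is a placeholder rather than the real site.

```java
// Hypothetical helper that builds one <url> entry for a location page.
final class SitemapXml {

    private SitemapXml() {}

    /** Escapes the five XML special characters; "&" must be replaced first. */
    static String escape(String value) {
        return value
            .replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;")
            .replace("'", "&apos;");
    }

    /** Builds an entry from a base URL and one to three slugs (region, province, municipality). */
    static String urlEntry(String baseUrl, String... slugs) {
        String path = "/locations/" + String.join("/", slugs) + ".html";
        return "<url><loc>" + escape(baseUrl + path) + "</loc></url>";
    }
}
```

For example, `SitemapXml.urlEntry("https://www.example.com", "lombardia", "milano")` yields an entry whose location is the clean public URL, not the internal selector-based path.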
Key Takeaways
- 8000+ SEO pages in AEM without JCR nodes is possible
- Dispatcher rewrite + Selectors = clean URLs
- Sling Rewriter = powerful but underused feature
- ISTAT = official source for Italian geographic data
- In-memory data = unbeatable lookup performance
- Solution choice depends on: budget, update frequency, scalability
Have you implemented similar solutions? Which approach did you choose? Share your experience in the comments!