I don't like storing non-product data in the Commerce Server catalog system. I know some smart people feel differently, and I respect their experience and opinions, but I've seen some incredible abuse of the catalog system. Here's one of the abuses that helped push me into the "If it's not a saleable product, it doesn't go into the catalog" camp.
Background: I've been working on an extremely large e-commerce project for the past 16 months. We took over the project's architecture, design, and implementation from a Microsoft Commerce Server partner, one that had supposedly built large Commerce Server sites before, only to find their solution was very incomplete and had some very significant problems at scale.
My client has a large movie and music catalog, and licenses information about their products from Muze, a leading media information provider. In the original solution, all of this media information was stored in a catalog. The catalog contained:
- special non-purchasable products representing artists, albums, discs, tracks, etc.;
- relationships between non-purchasable products, to associate artists with their albums, albums with their discs, discs with their tracks, and so on; and
- relationships between purchasable and non-purchasable products, to associate music products with the artist and album products that contain the media information for the product.
I'm assuming the original team saw some advantages in doing this:
- They could edit the media information in Catalog Manager. This would have been helpful during development, but the client wasn't going to do this, and it became increasingly difficult to use as the amount of data increased.
- They could stage the media information between environments using Commerce Server Staging (CSS). This also would have been helpful, but the client wasn't going to edit or stage this information, so they could have just as easily inserted this information directly into all environments.
- They could use the catalog API for all of their data access. They didn't, but they could have.
- They didn't need a custom database. They still created custom databases for other components, but they wouldn't have needed one for this one.
However, our team had to deal with the disadvantages, which were much more significant:
- The catalogs were huge. An average music product was composed of ~16 different products, plus relationships, including ~12 track products, 1 disc product, 1 album product, 1 artist product, and the actual music product itself. A good-sized music catalog with 100,000 purchasable music products would have translated into ~1.6 million products. The entire Muze catalog would have translated into more than 16 million products.
- We needed to write custom code to maintain the data. We couldn't simply take the media information from Muze and bcp it into a database. We needed to write custom code to transform the media information into new products and relationships or updates to existing products and relationships.
- We needed to write custom code to retrieve the data. We couldn't simply execute a sproc that would return the media information for a specific product. We needed to write custom code to get the product's related artist and album products, then get the album product's related disc products, then get the disc products' related track products, and so on.
- The performance was terrible. Accessing the data for music products was expensive. In some extreme cases it took ~100 queries to retrieve the media information for a single product. Even with multiple layers of caching, the music product detail pages performed terribly, and there was no way to hit our performance targets for those pages.
- We couldn't transition the component to non-CS developers. The only people who could work on the component were developers who were familiar with the catalog and the catalog API.
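To make the retrieval cost concrete, here is a minimal Python sketch of the level-by-level traversal described above. It is a toy model, not the Commerce Server catalog API: `FakeCatalog`, `get_product`, `get_related_ids`, and the sample ids are all hypothetical, and each call is simply counted as one query.

```python
from collections import defaultdict


class FakeCatalog:
    """Hypothetical stand-in for a catalog: every lookup counts as one query."""

    def __init__(self):
        self.products = {}                # product id -> properties
        self.related = defaultdict(list)  # product id -> related product ids
        self.query_count = 0

    def add(self, product_id, kind, parent=None):
        self.products[product_id] = {"id": product_id, "kind": kind}
        if parent is not None:
            self.related[parent].append(product_id)

    def get_product(self, product_id):
        self.query_count += 1
        return self.products[product_id]

    def get_related_ids(self, product_id):
        self.query_count += 1
        return list(self.related[product_id])


def load_media_info(catalog, product_id):
    """Walk the relationships one level at a time, one product per call."""
    product = catalog.get_product(product_id)
    children = [load_media_info(catalog, child_id)
                for child_id in catalog.get_related_ids(product_id)]
    return {**product, "children": children}


def build_sample(track_count=12):
    """One music product with 1 artist, 1 album, 1 disc, and 12 tracks."""
    catalog = FakeCatalog()
    catalog.add("cd-1", "music product")
    catalog.add("artist-1", "artist", parent="cd-1")
    catalog.add("album-1", "album", parent="cd-1")
    catalog.add("disc-1", "disc", parent="album-1")
    for n in range(1, track_count + 1):
        catalog.add(f"track-{n}", "track", parent="disc-1")
    return catalog


catalog = build_sample()
info = load_media_info(catalog, "cd-1")
print(len(catalog.products), "catalog entries,", catalog.query_count, "queries")
# prints "16 catalog entries, 32 queries"
```

In this model every entry costs two lookups (the product itself plus its relationships), so the query count grows linearly with the number of discs and tracks. The real API may batch some of these calls, but the shape of the problem is the same.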
We wound up rewriting the media information component to use a custom database instead of the catalog. This eliminated the disadvantages above and greatly improved performance. Unfortunately, because the original component was "code complete", this was unplanned work. If the original team had given more thought to the question of where to store the data, or had a default position of "only products in the catalog", we probably wouldn't have had to spend the time and money we did on rewriting this component.
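For illustration only, here is a rough sketch of what a custom-database approach could look like. This is not our actual schema: the table and column names are made up, and SQLite (via Python's standard `sqlite3` module) stands in for SQL Server and a sproc. The point is simply that the media information for a product comes back from one joined query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: plain relational tables instead of catalog products.
conn.executescript("""
    CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE album  (album_id INTEGER PRIMARY KEY,
                         artist_id INTEGER NOT NULL REFERENCES artist(artist_id),
                         product_id TEXT NOT NULL,   -- id of the saleable catalog product
                         title TEXT NOT NULL);
    CREATE TABLE disc   (disc_id INTEGER PRIMARY KEY,
                         album_id INTEGER NOT NULL REFERENCES album(album_id),
                         disc_number INTEGER NOT NULL);
    CREATE TABLE track  (track_id INTEGER PRIMARY KEY,
                         disc_id INTEGER NOT NULL REFERENCES disc(disc_id),
                         track_number INTEGER NOT NULL,
                         title TEXT NOT NULL);
""")

# Sample data: one artist, one album, one disc, twelve tracks.
conn.execute("INSERT INTO artist VALUES (1, 'Example Artist')")
conn.execute("INSERT INTO album VALUES (1, 1, 'cd-1', 'Example Album')")
conn.execute("INSERT INTO disc VALUES (1, 1, 1)")
conn.executemany(
    "INSERT INTO track VALUES (?, 1, ?, ?)",
    [(n, n, f"Track {n}") for n in range(1, 13)],
)


def get_media_info(conn, product_id):
    """Return artist, album, disc, and track data for a product in one query."""
    return conn.execute("""
        SELECT ar.name, al.title, d.disc_number, t.track_number, t.title
        FROM album al
        JOIN artist ar ON ar.artist_id = al.artist_id
        JOIN disc d    ON d.album_id = al.album_id
        JOIN track t   ON t.disc_id = d.disc_id
        WHERE al.product_id = ?
        ORDER BY d.disc_number, t.track_number
    """, (product_id,)).fetchall()


rows = get_media_info(conn, "cd-1")
print(len(rows), "rows from one query; first:", rows[0])
```

The catalog keeps only the saleable product; the media tables just point back at it through `product_id`, so loading a feed becomes a bulk insert and reading it becomes one round trip.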
Colin