The following is an extract from an assignment in which I had to differentiate a data mart from a data warehouse. Feel free to add to or correct the discussion.
A company, let's say Alphabet, may run a platform made up of different applications, each with its own operations and each producing different information. Each application stores the data it needs for functional purposes; a video-playing website, for instance, would need to store a user's watch history, channel subscriptions, and so on. This video-related data may be what the website needs to run on a daily basis, so the website keeps its own store for it. It goes without saying that every other application will likewise have its own store of operationally critical data. In this example, each such application-specific store is what we would call a data mart.
Alphabet now has to store all that information in a central location for its own enterprise-wide purposes. The data Alphabet needs for this greater purpose will not only come from different data marts but will also span a long stretch of history, and it will often arrive in inconsistent forms. Alphabet might have to transform some of the data it receives from its applications to fit the schema it implements for its data warehouse. This transformation may include cleaning the data values, integrating and smoothing them, and finally bringing the different persistence systems into a single, possibly heterogeneous, one. The warehouse keeps as much detailed information as it can get for long-term analysis of operations. The data stored in Alphabet's warehouse is a Goliath compared to the already staggering quantities produced by any one of its many applications (the video website, say), because it also has to store the details it gets from all the other applications.
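
To make the transformation step a bit more concrete, here is a minimal Python sketch of how rows from two marts might be cleaned and integrated into one warehouse schema. Everything in it is an assumption made up for illustration: the two marts (a video one and a mail one), their field names, the sample rows, the target schema (user_id, event_time, source, item_id), and the idea that the video mart's timestamps are in UTC. It covers only cleaning and integrating, not smoothing, and it is not meant to describe how Alphabet or any real warehouse actually does this.

```python
from datetime import datetime, timezone

# Hypothetical rows, as two application data marts might export them.
# Note the inconsistent field names, name formats, and time formats.
video_mart_rows = [
    {"user": " Alice ", "watched_at": "2021-03-01T10:15:00", "video_id": "v1"},
    {"user": "bob", "watched_at": "2021-03-02T08:00:00", "video_id": None},
]
mail_mart_rows = [
    {"account_name": "ALICE", "sent_epoch": 1614593700, "message_id": "m9"},
]


def clean_user(name):
    """Cleaning: normalise user names so the same user matches across marts."""
    return name.strip().lower()


def from_video_mart(row):
    """Map a video-mart row to the (assumed) warehouse schema, or None if unusable."""
    if row["video_id"] is None:  # cleaning: drop rows missing a key value
        return None
    # Assumption: the video mart stores naive timestamps that are in UTC.
    when = datetime.fromisoformat(row["watched_at"]).replace(tzinfo=timezone.utc)
    return {
        "user_id": clean_user(row["user"]),
        "event_time": when,
        "source": "video",
        "item_id": row["video_id"],
    }


def from_mail_mart(row):
    """Map a mail-mart row (epoch seconds) to the same warehouse schema."""
    return {
        "user_id": clean_user(row["account_name"]),
        "event_time": datetime.fromtimestamp(row["sent_epoch"], tz=timezone.utc),
        "source": "mail",
        "item_id": row["message_id"],
    }


def load_warehouse(video_rows, mail_rows):
    """Integration: merge both marts into one list ordered by event time."""
    events = [from_video_mart(r) for r in video_rows]
    events += [from_mail_mart(r) for r in mail_rows]
    events = [e for e in events if e is not None]
    return sorted(events, key=lambda e: e["event_time"])


warehouse = load_warehouse(video_mart_rows, mail_mart_rows)
for event in warehouse:
    print(event["user_id"], event["source"], event["item_id"], event["event_time"])
```

The point of the sketch is that each mart keeps whatever shape suits its own application, while the warehouse side needs one mapping function per source so that everything lands in a single, consistent schema before long-term analysis.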