When a server shuts down depends on several things.
The OLE object that gets placed on the clipboard has interfaces that must be released. From the perspective of the OLE server, the Windows clipboard is just another consumer that must call IUnknown::AddRef() and IUnknown::Release(), which is what happens when the OLE object is placed on the clipboard and when it is removed, respectively.
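To make the "just another consumer" point concrete, here is a minimal sketch of the reference counting involved. It is a portable stand-in, not the real IUnknown from the Windows headers: the class name ClipboardDataObject and the counter g_liveObjects are made up for illustration, and it assumes the object deletes itself when the last reference goes away, which is the usual convention.

```cpp
#include <atomic>

// Hypothetical counter of objects the server currently has alive.
std::atomic<long> g_liveObjects{0};

// Simplified stand-in for an OLE data object (NOT the real IUnknown).
class ClipboardDataObject {
public:
    // The creator starts out holding the first reference.
    ClipboardDataObject() : refs_(1) { ++g_liveObjects; }

    // Every consumer, the clipboard included, takes a reference...
    unsigned long AddRef() { return ++refs_; }

    // ...and gives it back when done. The last Release() destroys the object.
    unsigned long Release() {
        const unsigned long remaining = --refs_;
        if (remaining == 0) {
            delete this;
        }
        return remaining;
    }

private:
    // Private so the object can only be destroyed through Release().
    ~ClipboardDataObject() { --g_liveObjects; }

    std::atomic<unsigned long> refs_;
};
```

With this model, an object that the application has already released stays alive for as long as the clipboard still holds its own reference, and the server cannot consider itself idle until that reference is released too.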
If the server is loaded into the host process (an 'in-process' server), then even without putting anything on the clipboard, the server will not be unloaded immediately. That does not happen until some code in the host process calls CoFreeUnusedLibraries() or CoFreeUnusedLibrariesEx(), which asks each loaded COM server whether it is OK to unload it (by calling the server DLL's exported DllCanUnloadNow() function). If the call returns S_OK, then OLE unloads the server; if it returns S_FALSE, the server stays loaded.
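The unload handshake described above can be sketched roughly like this. It is a simplified, portable model and not the actual COM implementation: InProcServer, FreeUnusedServers and the kS_OK/kS_FALSE constants are names I made up (the numeric values mirror S_OK and S_FALSE), and the assumption that a server tracks a live-object count plus a lock count is only the common convention for implementing DllCanUnloadNow().

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Stand-ins for the HRESULT values S_OK (0) and S_FALSE (1).
using HResult = long;
constexpr HResult kS_OK = 0;
constexpr HResult kS_FALSE = 1;

// Hypothetical model of one in-process server DLL.
struct InProcServer {
    std::atomic<long> liveObjects{0};  // objects handed out and not yet released
    std::atomic<long> lockCount{0};    // outstanding server locks
    bool loaded = true;

    // Plays the role of DllCanUnloadNow(): S_OK only when nothing is in use.
    HResult CanUnloadNow() const {
        return (liveObjects.load() == 0 && lockCount.load() == 0) ? kS_OK
                                                                  : kS_FALSE;
    }
};

// Plays the role of a CoFreeUnusedLibraries() pass made by the host:
// ask every loaded server, unload the ones that agree.
// Returns how many servers were unloaded in this pass.
std::size_t FreeUnusedServers(const std::vector<InProcServer*>& servers) {
    std::size_t unloaded = 0;
    for (InProcServer* server : servers) {
        if (server->loaded && server->CanUnloadNow() == kS_OK) {
            server->loaded = false;  // the real host would free the DLL here
            ++unloaded;
        }
    }
    return unloaded;
}
```

The thing to notice is that nothing happens at the moment the last object is released. An idle server stays loaded until the host gets around to making the next pass.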
So, the point at which an in-process server is actually unloaded from the host process depends on when the host process calls CoFreeUnusedLibraries(), which AutoCAD (at least) does periodically, every few minutes. Those of us who have debugged in-process COM servers running in AutoCAD are all too familiar with this, since it prevents us from rebuilding our project: the project DLL cannot be overwritten while it is loaded into a host process. Usually, I don't wait for AutoCAD to call CoFreeUnusedLibraries() and just terminate and restart it instead.
If the server is not running in-process, then it depends on the server and its cache policy. From what I've observed, IE components, including the HTML component, are by design left running, so that if the user repeatedly starts an instance of the IE browser with no other instances running, they do not have to wait as long as they did the first time it was started in the Windows session. IOW, the delay you see the first time your code runs is exactly that load cost. If the server were unloaded whenever it had no remaining dependents, then each time a dependent started, the server would have to be reloaded, and there would be a long delay. Microsoft knows that users don't like that, so their solution is to keep some components running even when nothing is using them, so that they do not have to be reloaded each and every time something that needs them is started.