The essentials of central log collection with WEF and WEC
This blog discusses, mentions, or contains links to an Elastic training program that is now retired. For more Elastic resources, please visit the Getting Started page.
Last week we covered the essentials of event logging: ensuring that all your systems are writing logs about the important events or activities occurring on them. This week we will cover the essentials of centrally collecting these Event Logs on a Windows Event Collector (WEC) server, which then forwards all logs to Elastic Security.
WEF and WEC
Modern versions of Windows include the Windows Remote Management (WinRM) services that implement the WS-Management (WSman) protocol, and just to add to the acronym spaghetti soup, this is all part of Windows Management Instrumentation (WMI). One component of WinRM is the Windows Event Forwarding (WEF) service, which is why WinRM and co. need to be enabled. WEF can forward Windows Event Logs to a Windows Server running the Windows Event Collector (WEC) service.
There are two modes of forwarding:
- Source Initiated: The WEF service connects to the WEC server
- Collector Initiated: The WEC service connects to the WEF service
Both use WSman to forward the logs and require WinRM to be running.
There are a number of pitfalls and hurdles when setting up WEF and WEC. Following our WEC Cookbook, you can avoid these. However, for a higher-level view with richer context, we will discuss them here, as well as the solution taken in the Cookbook.
‘Forwarded Events’ event log file
In the Windows Event Log system there are Channels. These Channels are ultimately backed by an event log file that stores all the event logs written to that Channel. A Windows system comes with a set of predefined Channels and applications can add their own Channels by registering new “Providers.”
This means that out of the box, a WEC server only has the Channels that a normal Windows server has anyway for its own logs. Then where should one store all the logs that are being forwarded to the WEC server? There are three options; let's look at them:
1. Store in the local Channel matching the remote Channel (i.e., the remote “Security” Channel events are stored in the WEC’s local “Security” Channel).
Pitfalls:
- All your remote logs are mixed with your local logs
- The WEC server may loop its own event logs to this Channel
- Log management and access control are made very difficult
2. Store all the remote logs in the local “Forwarded Events” Channel.
Pitfalls:
- Poor write performance, since all writes are to a single file
- Poor search/read performance, as events are not partitioned in separate files
- Poor data life cycle management, as this is per log file, therefore all forwarded events are treated as equal
- Poor resource utilisation of the WEC server, because all work is bottle-necked to a single file
- Poor access management, since separate files would allow differentiated file access controls
- Poor coverage/visibility — due to the issues above, many heavily restrict what event logs are forwarded, leaving gaps in their visibility
3. Create new Channels for the WEC server.
This is not as obvious as it might seem, and most would be forgiven for not knowing that it was an option.
Many WEC servers were set up with options 1 or 2 (above), until Microsoft's own internal Security team wrote a blog post (about 15 years ago) on how they used the Windows SDK to implement option 3. Here is a similar revision posted in 2016.
New WEC event Channels
Armed with the ability to create arbitrary event Channels, what should we create? How should we organise and architect our WEC server? There are many schools of thought. The WEC Cookbook groups enterprise assets together so that you can manage your logs’ access control and data lifecycle accordingly.
Before we delve deeper, let’s look at another approach. Some of you might have come across Palantir’s WEC architecture and guideline. Here, they created a Channel per event log type: Powershell, WMI, DNS, Firewall, etc. These Channels contain logs from all asset types (Domain Controllers, Domain Servers, Domain Workstations) and Departments/Biz-Units/OUs. They’re also not organised into a hierarchy, so they’d just be a long list in Event Viewer. Each of these Channels also has its own WEC subscription. Palantir also has a recommended audit policy.
I like to point out other approaches, such as this one from Palantir, as no one size fits all and their approach might suit your organisation better than the one set out in our Cookbook.
If you have looked at Palantir’s Channel list, you will have noticed their “WEC#-Something” format, where the number ‘#’ increases every seven Channels. This is because in the Windows Event Log system, Channels are defined by what is known as a “Provider”, and a Provider can only define up to eight Channels.
However, an off-by-one bug in the “ecmangen” tool that all of us non-Windows-SDK-developer security people used made it frustrating to have more than seven Channels per Provider in it.
Instead of fixing the many bugs in it, it appears that Microsoft dropped ecmangen from the Windows SDK. This means you either use an older SDK or create the Manifest XML file yourself, perhaps with your favourite XML editor.
Like everyone else, I used ecmangen (originally) and stuck to seven Channels to make life easier. The current Cookbook is based on PowerShell scripts that generate the XML. Ecmangen is no longer needed, so you are free to have eight Channels if you want, although the Cookbook still only uses seven recommended Channels.
Note: Think of a Provider as a box of eight Channels, where each Channel is ultimately a separate log file.
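As a rough illustration of the "box of eight" idea, here is a minimal Python sketch that packs a flat list of Channel names into Providers. The function name, the `WEC1`, `WEC2`, ... Provider naming, and any Channel names you pass in are hypothetical; the Cookbook's PowerShell scripts may well organise things differently.

```python
from typing import Dict, List

# Limit described in the text: one Provider can define at most eight Channels.
MAX_CHANNELS_PER_PROVIDER = 8


def group_channels(
    channels: List[str],
    prefix: str = "WEC",  # hypothetical Provider name prefix
    per_provider: int = MAX_CHANNELS_PER_PROVIDER,
) -> Dict[str, List[str]]:
    """Pack Channel names, in order, into numbered Providers."""
    if not 1 <= per_provider <= MAX_CHANNELS_PER_PROVIDER:
        raise ValueError("per_provider must be between 1 and 8")
    providers: Dict[str, List[str]] = {}
    for start in range(0, len(channels), per_provider):
        number = start // per_provider + 1  # Providers are numbered from 1
        providers[f"{prefix}{number}"] = channels[start:start + per_provider]
    return providers
```

Passing `per_provider=7` mimics the ecmangen-era habit of stopping at seven Channels per Provider.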
Organise by asset
Taking advantage of the fact that you can have as many Providers as you want, we can use this to organise by asset. In your AD environment, you have probably already grouped domain members by asset type (e.g., Domain servers, Domain Controllers, workstations, etc.) and/or by the department (dare I say organisational unit) that those assets are in.
So in the Cookbook you can easily create Providers to match your AD’s organisation, along the lines of Providers per department (Biz Unit or OU) and/or per Asset type and/or per Asset criticality (Lab/Test/Production).
When you separate by asset type you get the added benefit of being able to better manage access control and log life cycle. If you need to look at the logs of a specific asset type, you know where they are.
To make things simple, all Providers then get the same set of (up to eight) Channels. Then systems are mapped to a Provider (via OU) and the event logs on that system map to Channels in that Provider.
Out of the box, the “wec_config.ps1” script that is used to configure the WEC server to match your AD architecture has some Providers and asset assignment defined:
- Domain Controllers: I think member assignment is clear here
- Domain Servers: The servers in your domain
- Domain Clients: User workstations (desktop/laptops)
- Domain Privileged: More privileged systems (e.g., Jumphosts or WEC Server)
- Domain Members: Catch-all for normal domain members; not in the other groups
- Domain Misc: Miscellaneous, for those hosts that don't fit
You are encouraged to refine and edit the list to match your AD environment.
Then the out-of-the-box Channel list is:
- Application: “Application” and similar logs
- Security: “Security” and similar logs
- Sysmon: “Microsoft-Windows-Sysmon/Operational”
- System: “System”, “HardwareEvents”, DNS-Client, DHCP-Client, “Setup” and similar logs
- Script: “Windows PowerShell” and similar logs
- Service: DNS-Server, DHCP-Server, and other service logs
- Misc: Any other miscellaneous logs
Again, you are free to change this to suit your needs.
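To make the two-step mapping concrete (system to Provider, source log to Channel), here is a small Python sketch. The lookup table only covers the source logs named in the list above, and the `WEC-<Provider>/<Channel>` destination naming is an assumption made for illustration, not necessarily what the Cookbook scripts generate.

```python
from typing import Dict

# Source event log -> Channel category, taken from the Channel list above.
# Anything not listed falls through to "Misc".
SOURCE_LOG_TO_CHANNEL: Dict[str, str] = {
    "Application": "Application",
    "Security": "Security",
    "Microsoft-Windows-Sysmon/Operational": "Sysmon",
    "System": "System",
    "HardwareEvents": "System",
    "Setup": "System",
    "Windows PowerShell": "Script",
}


def destination_channel(provider: str, source_log: str) -> str:
    """Return a (hypothetically named) WEC Channel for a Provider and source log."""
    channel = SOURCE_LOG_TO_CHANNEL.get(source_log, "Misc")
    # Assumed naming scheme: "WEC-<Provider with dashes>/<Channel>".
    return f"WEC-{provider.replace(' ', '-')}/{channel}"
```

For example, `destination_channel("Domain Controllers", "Security")` yields `WEC-Domain-Controllers/Security` under this assumed naming scheme.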
WEC subscriptions
A WEC subscription defines the following:
- An event log (XPath) filter, selecting what events should be forwarded
- A destination Channel, stating where to store the received events on the WEC server
- Type:
  - Collector Initiated, the WEC connects to the WEF service
    - Target computers, a list of computers to connect to
  - Source Initiated, the WEF connects to the WEC server
    - Computer groups, the AD groups whose (computer) members may access this subscription
- Event delivery options to control bandwidth/latency and/or HTTP/HTTPS
- Format type: RenderedText or just the Event XML
The Cookbook scripts (notably setup_subscriptions.ps1) configure the Event XML format type. Events in this format are much smaller, which means more throughput, more logs stored, less load, and less bandwidth. The drawback is that if the source Provider (on the remote system) is not registered locally on the WEC server’s Event Log system, then the Event Viewer won't be able to display a text description of the message in your local language. Even so, sending Rendered Text events is so resource intensive that it’s difficult to justify.
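The event log filter in a subscription is an XPath expression wrapped in the same `QueryList` XML that Event Viewer produces for custom views. As a minimal sketch, the Python below builds such a filter for a handful of Event IDs; the function name is hypothetical, and the Cookbook itself keeps its filters in wec_config.ps1 rather than generating them this way.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def build_query_list(source_log: str, event_ids: Iterable[int]) -> str:
    """Build a QueryList selecting the given Event IDs from one source log.

    An empty event_ids selects every event in the log ("*").
    """
    ids = list(event_ids)
    if ids:
        condition = " or ".join(f"EventID={event_id}" for event_id in ids)
        xpath = f"*[System[({condition})]]"
    else:
        xpath = "*"

    query_list = ET.Element("QueryList")
    query = ET.SubElement(query_list, "Query", {"Id": "0", "Path": source_log})
    select = ET.SubElement(query, "Select", {"Path": source_log})
    select.text = xpath
    return ET.tostring(query_list, encoding="unicode")
```

Calling `build_query_list("Security", [4624, 4625])` produces a filter that forwards only those two logon-related Event IDs from the Security log.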
I mentioned at the start that WEF is a function of WinRM. Well, this WinRM component runs as the local system’s “Network Service” user. This means that WEF can’t actually read most of your system logs, and your WEC server will receive a very nondescript Event ID 111 message and no other logs. For this reason, the Cookbook guides you through creating a GPO to add “Network Service” to the local “Event Log Readers” group.
How does WinRM get the configuration for WEF? In the same GPO as mentioned above, we also publish a WSman URL that lists all the WEC subscriptions on that server. In fact, we can list multiple WSman subscription URLs from multiple WEC servers, and the WEF service will try to get and execute them all, thus allowing for redundant WEC servers.
All the subscriptions? I don’t want my Workstation sending event logs to my Domain Controller log files! The WSman entries that represent a subscription have AD group permissions applied to them as set up in the Subscription configuration. This means if the computer that WEF is running on is not a member of an AD group that has permission to read the subscription, it can’t get and execute the subscription. It also means that if you’re not careful and a computer is a member of more than one WEC subscription AD group, you will get multiples of the same event log from that WEF host on your WEC!
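One way to catch the duplicate-delivery problem before it happens is to check the subscription groups for overlapping members. The Python sketch below works on plain dictionaries of group name to computer names; how you export that membership from AD is up to you, and all names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_overlapping_computers(
    group_members: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return computers that belong to more than one subscription group.

    Computer names are compared case-insensitively; the result maps the
    lower-cased computer name to the sorted list of groups it appears in.
    """
    memberships: Dict[str, List[str]] = defaultdict(list)
    for group in sorted(group_members):
        # De-duplicate within a group so one group is never counted twice.
        for computer in {name.lower() for name in group_members[group]}:
            memberships[computer].append(group)
    return {c: groups for c, groups in memberships.items() if len(groups) > 1}
```

Any computer reported here would receive more than one subscription and therefore forward the same events more than once.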
Computer Groups? But I want to map computers based on the OU that they are placed in! Unfortunately, that is not how WEF/WEC/WinRM/WSman work. However, the Cookbook provides a mechanism to keep a given group’s membership in sync with specified OU locations. Thus you can pretend everything is done via OU!
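The idea of keeping a group in sync with OUs boils down to a set difference. The following Python sketch only plans the changes from lists of distinguished names; it does not talk to AD, the function names are hypothetical, and it is not the logic of map_ou2group.ps1 itself. The DN matching is deliberately simplistic (a case-insensitive suffix check that ignores escaped commas).

```python
from typing import Iterable, List, Tuple


def is_under_ou(computer_dn: str, ou_dn: str) -> bool:
    """True if the computer's DN sits anywhere below the given OU DN."""
    return computer_dn.lower().endswith("," + ou_dn.lower())


def plan_group_sync(
    current_members: Iterable[str],
    all_computers: Iterable[str],
    ou_dns: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (to_add, to_remove) so the group matches the OUs' computers.

    All DNs are lower-cased so the comparison is case-insensitive.
    """
    ous = list(ou_dns)
    current = {dn.lower() for dn in current_members}
    desired = {
        dn.lower()
        for dn in all_computers
        if any(is_under_ou(dn, ou) for ou in ous)
    }
    return sorted(desired - current), sorted(current - desired)
```

Running a plan like this on a schedule is what lets you treat the group as if it were the OU.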
Bringing it all together
There are a lot of moving parts, and plenty of complexity, to get right when setting up WEF & WEC for good observability or security use cases.
Fear not, however, for our Cookbook is here, accompanied by a set of PowerShell scripts to automate most of the steps. This means there is less room for mistakes, actions are reproducible, and thus mistakes are more easily fixable.
It all starts with the wec_config.ps1 script, which you are expected to edit to your heart's content. All the following scripts will take their lead from that. Therefore, you can easily, for example, change the event log filter used to select what event logs get forwarded in wec_config.ps1, and then re-run setup_subscriptions.ps1 to apply the change.
Let's take a look at what the scripts do (the Cookbook goes into far more detail on how to use them):
- wec_config.ps1 - The configuration of your WEC server, sourced by the other scripts
- gen_manifest.ps1 - This will create the Manifest XML that describes all your Providers and their Channels for the Windows SDK (no need to use ecmangen anymore!)
- build_man2dll.ps1 - Taking your manifest, this will build the Windows Event Subsystem Module DLL that implements all your new Providers and Channels on any system you install it (usually the WEC server)
- install_channels.ps1 - Takes the DLL and Manifest and installs them on the Local system
- configure_channels.ps1 - Will apply the Log Path and Log Size configuration (from wec_config.ps1) to all your newly installed Channels
- setup_subscriptions.ps1 - Will set up (create or reconfigure) all the subscriptions for your Provider/Channels on the WEC server
- map_ou2group.ps1 - You probably want to use your AD’s OUs, but WEC Subscriptions select computers via AD Groups. This script will sync the membership of given groups to the computers under specified OUs, again using the configuration in wec_config.ps1
- gen_winlogbeat_config.ps1 - The config that ships with Winlogbeat won’t know about all your extra WEC subscription Channels, so this will update that configuration for you
- beat_cmd.ps1 - A helper script for interacting with Beat commands on PowerShell
Unfortunately, all the configuration on the AD side, such as Group Policies, still has to be done manually, but the Cookbook has step-by-step guides with screenshots. I may write scripts for that too one day, so stay tuned.
Finally, Winlogbeat needs to be configured to send all the WEC logs to Elastic Security. The Cookbook will guide you through this too.
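Winlogbeat reads the Channels listed under `winlogbeat.event_logs` in its configuration, so every new WEC Channel needs an entry there. The Python sketch below renders such a section for every Provider and Channel combination; the function name and the `<Provider>/<Channel>` naming are assumptions for illustration, and gen_winlogbeat_config.ps1 is the tool the Cookbook actually provides for this.

```python
import json
from typing import Iterable, List


def render_event_logs(providers: Iterable[str], channels: Iterable[str]) -> str:
    """Render a winlogbeat.event_logs YAML section, one entry per Channel."""
    channel_list = list(channels)  # reused for every Provider
    lines: List[str] = ["winlogbeat.event_logs:"]
    for provider in providers:
        for channel in channel_list:
            name = f"{provider}/{channel}"  # assumed Channel naming scheme
            # A JSON string is also a valid double-quoted YAML scalar.
            lines.append(f"  - name: {json.dumps(name)}")
    return "\n".join(lines) + "\n"
```

The returned text can be pasted into, or merged with, the rest of your winlogbeat.yml.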
Conclusion
I hope after reading this blog post, and possibly the Cookbook itself, you have a good idea of the decisions you need to make before you start, as well as now having all the guidance and tools you need to create that perfect WEC server for your enterprise.
Now that you have the proper audit policies in place, WEF configured, and a WEC server set up to forward your AD domain’s event logs to Elastic Security, our following blog post will look at what you can do with this extremely important and useful log data in Elastic Security.
If you’re new to Elastic Security, you can experience our latest version on Elasticsearch Service on Elastic Cloud. Also be sure to take advantage of our Quick Start training to set yourself up for success.
See other Cookbook guides that I have written: https://ela.st/tjs-cookbook-lib