Alert Management
Our organization already uses Freshservice for IT Service Management. Our IT Ops group includes internal IT Operations and the Service Desk, which work together. We also have a DevOps team that we would like to integrate more closely with Freshservice as our single point of contact. Can Freshservice accommodate a setup that includes all of these teams?
Answer: Yes, we should be able to accommodate such a setup.
How to automatically associate an asset in Freshservice inventory to an incoming alert for the same asset?
Answer: Currently, the 'Resource' field is the primary key for detecting an asset and mapping it to the incident.
How can Freshservice handle several alerts from the same location and make sure all devices are up before the ticket is closed, without flooding the IT department? For example, a firewall goes down, and all access points and switches generate alarms.
Answer: In this scenario, the alerts coming in from the different devices are associated with the same incident. You can make this association manually, or our Freddy-enabled automated grouping should be able to do it for you. So if you have 10 devices and 10 alerts, all of them are rolled up into a single incident, and your IT team only needs to work on that one incident. The incident is automatically resolved only when all 10 associated alerts are resolved.
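The roll-up rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Freshservice code: the Alert and Incident classes and the is_resolved method are hypothetical names, and treating an incident with no alerts as unresolved is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    resource: str           # device that raised the alert (hypothetical field)
    resolved: bool = False


@dataclass
class Incident:
    alerts: List[Alert] = field(default_factory=list)

    def is_resolved(self) -> bool:
        # Resolved only when every associated alert is resolved.
        # Assumption: an incident with no alerts is not considered resolved.
        return bool(self.alerts) and all(a.resolved for a in self.alerts)


# Ten devices raise ten alerts that are grouped into one incident.
incident = Incident([Alert(f"device-{i}") for i in range(10)])

# One device recovers: the incident stays open.
incident.alerts[0].resolved = True
one_up = incident.is_resolved()

# All devices recover: the incident can be resolved.
for alert in incident.alerts:
    alert.resolved = True
all_up = incident.is_resolved()
```

Here one_up is False and all_up is True, which matches the behavior described in the answer: a single recovered device does not close the incident.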
Will those same notifications for on-call be in the Major Incident Management process once that is released as well?
Answer: Yes, you will be able to use on-call management notifications in the Major Incident Management process.
Are there plans to increase the number of alert tags allowed in an alert?
Answer: Yes, this is part of our immediate roadmap, and you will hear more about it in the coming months.
In your terminology, what is the difference between a ‘Problem’ and a ‘Major Incident’?
Answer: The objective of the Major Incident Management process is to restore service to normal as soon as possible, whereas the objective of Problem Management is to find the root cause of the issue.
Major Incident Management process aims to resolve issues faster whereas Problem Management tries to prevent recurring issues in the future.
To sum up, a Problem is the underlying issue that can result in one or more incidents or in a Major Incident. A Major Incident is an event with a massive impact on your customers and business that needs to be resolved immediately. An incident or Major Incident is the issue the customer is facing, while the Problem is the underlying cause.
We are using Email Monitoring to let third-party web-based (SaaS) vendors inform us of maintenance windows, service degradations, and more. We find that email alerts are hit and miss for auto-grouping. Would RegEx controls improve that?
Answer: Regex controls will give you greater control and flexibility over alert grouping criteria. Previously, only emails with the same subject were grouped together. Regex controls allow you to extract values from emails and use those values as grouping criteria for email integrations.
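As a rough illustration of the idea, the Python sketch below groups emails by a value extracted with a regular expression instead of by the full subject line. The pattern, the "INC-" reference format, the sample subjects, and the group_emails function are all hypothetical; they are not the Freshservice configuration syntax.

```python
import re
from collections import defaultdict

# Hypothetical pattern: vendor emails carry a reference such as "INC-1042".
GROUP_KEY = re.compile(r"\b(INC-\d+)\b")


def group_emails(subjects):
    """Group email subjects by the extracted reference.

    Falls back to the full subject when no reference is found, which
    mirrors the older behavior of grouping identical subjects only.
    """
    groups = defaultdict(list)
    for subject in subjects:
        match = GROUP_KEY.search(subject)
        key = match.group(1) if match else subject
        groups[key].append(subject)
    return dict(groups)


subjects = [
    "[Vendor] INC-1042 Service degradation started",
    "[Vendor] INC-1042 Update: mitigation in progress",
    "[Vendor] INC-2077 Scheduled maintenance window",
]
groups = group_emails(subjects)
```

The first two subjects differ, so subject-only matching would keep them apart, but both carry INC-1042 and therefore land in the same group.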
What happens if just 1 of 10 (payloads/devices) comes back up? Will the ticket be closed?
Answer: If every device has a separate alert, i.e., 10 alerts are associated with 1 incident, then the incident will be automatically resolved only when all 10 alerts are resolved.
Are there plans to ever offer workflow automation for alerts directly without needing to create a ticket for the alert first?
Answer: Yes, this is on our roadmap. If you have a use case that needs workflow automation for alerts, please write to [email protected] describing it.
Service Health Monitoring
How are "Potential Services" detected? We've defined VERY few services (six so far) - yet have no recommendations.
Answer: Potential Services are detected while you configure a monitoring tool: you choose the payload attribute in the alert payload that contains the service name. You can use the attribute value as is, or extract a specific part of it with a Regular Expression. Every alert carrying that attribute is then passed through the Regular Expression, if one is defined, and the Potential Services are detected from the result.
Example:
Payload attribute: resource
Payload details from the alert: "resource": "bookings-database"
Regular Expression: none defined. In this case, the Potential Service will be Bookings Database.
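The example above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the product's implementation: the potential_service function and its parameters are hypothetical, and the display formatting (separators replaced with spaces, title case) is inferred only from the single "bookings-database" to "Bookings Database" example.

```python
import re
from typing import Optional


def potential_service(payload: dict, attribute: str,
                      pattern: Optional[str] = None) -> Optional[str]:
    """Derive a potential service name from one payload attribute."""
    raw = payload.get(attribute)
    if raw is None:
        return None  # the alert does not carry the chosen attribute
    if pattern is not None:
        match = re.search(pattern, raw)
        if match is None:
            return None  # the Regular Expression did not match
        # Use the first capture group if the pattern has one, else the whole match.
        raw = match.group(1) if match.groups() else match.group(0)
    # Assumed display formatting: "bookings-database" -> "Bookings Database".
    return raw.replace("-", " ").replace("_", " ").title()


# No Regular Expression defined: the attribute value is used as is.
name = potential_service({"resource": "bookings-database"}, "resource")

# Hypothetical Regular Expression that strips an environment prefix.
trimmed = potential_service(
    {"resource": "prod/bookings-database"}, "resource", r"prod/(.+)"
)
```

Both calls yield "Bookings Database". An alert without the chosen attribute, or one the Regular Expression does not match, yields no potential service in this sketch.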
If you manually change the state of a service and another alert comes in, will it change it back? How long does it take to change when the alert comes in?
Answer: If you manually change the status of a service to, say, Operational, and an alert then comes in and creates an incident, the status of the service is changed back to Needs Attention.
How to configure services when one monitoring tool is mapped to multiple services? Can the services be manually created and mapped to the monitoring tool? Or is it necessary to process an alert and 'discover' the services via Potential Services?
Answer: When one monitoring tool is mapped to multiple services, the only way to configure those services is to process an alert and discover them through Potential Services.
Is this going to be integrated with the upcoming Major Incident Management enhancement? If so, are there any 'best practices' to follow when creating services and monitoring tools with this module?
Answer: Yes, services will be closely integrated with Major Incident Management. In the Major Incident Management module we will have a section named Impacted Services which will display the services impacted by the major incident. For now, the practices followed for monitoring tool integration with Service Health Monitoring should hold good.
How can we adopt the SRE model with this function?
Answer: SRE teams can adopt this model by:
Identifying the services in their infrastructure (manually, or automatically using Potential Services in Service Health Monitoring)
Configuring service maps and dependencies to identify relationships among services
Continuously monitoring the services through monitoring tool integrations, determining their health status, and acting accordingly
It seems that most of these service and alert integration pieces are tied to cloud services. Are there plans to integrate more on-premises software?
Answer: If you need support for specific on-premises software, we can consider it, but our direction is mostly towards supporting more cloud-based monitoring tools.
On-Call Management
For the Microsoft Teams integration, would we be able to have a 3rd party call a Microsoft Teams number to trigger an on-call alert?
Answer: The Microsoft Teams integration for on-call will send notifications to the on-call agent as a direct message via Servicebot.