Troubleshooting
Follow the server logs in case of errors:
- apache-tomcat-x/logs/catalina.out
- apache-tomcat-x/logs/asgard.log
If you cannot determine the solution, search the Asgard Google group for the error. If you don't find a solution, post your question there.
If your account contains a large number of cloud objects, you might need to increase Asgard’s memory footprint. It’s also advisable to change the JVM garbage collection settings. To make these changes, create a setenv.sh file in apache-tomcat-x/bin/ similar to the following code block, then restart Tomcat.
if [ "$1" = "start" ]; then
export JAVA_OPTS=" \
-verbose:sizes \
-Xmx4g -Xms4g \
-Xmn2g \
-XX:MaxPermSize=256m \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:-UseGCOverheadLimit \
-XX:+ExplicitGCInvokesConcurrent \
-XX:+PrintGCDateStamps -XX:+PrintGCDetails \
-XX:+PrintTenuringDistribution \
-XX:+CMSClassUnloadingEnabled \
-XX:+UseConcMarkSweepGC \
"
else
export JAVA_OPTS=""
fi
If Asgard has still not started after several minutes, check your asgard.log file. See if it shows a lot of exceptions while filling caches, like one of these messages:
[2012-11-05 17:08:43,772] [background-process-4] com.netflix.asgard.CachedMap Exception filling cache us-east-1 Security Group
com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[5097,13]
Message: Read timed out)
[2012-11-26 17:47:32,905] [background-process-6] com.netflix.asgard.CachedMap Exception filling cache ap-northeast-1 Spot Instance Request
com.amazonaws.AmazonClientException: Unable to execute HTTP request: Connect to ec2.ap-northeast-1.amazonaws.com/ec2.ap-northeast-1.amazonaws.com/27.0.2.68 timed out
If so, there might be a lot of metadata in your AWS accounts and/or you may have a flaky network connection between your Asgard instance and some AWS API endpoints.
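To see which caches fail most often, the log can be summarized per cache name. The following is a minimal sketch that assumes log entries keep the format of the sample messages above; the function name count_cache_failures is illustrative and not part of Asgard.

```shell
# Count "Exception filling cache" log entries per cache name.
# Assumes each entry keeps the format shown in the sample messages.
count_cache_failures() {
  log_file="$1"
  grep 'Exception filling cache' "$log_file" \
    | sed 's/.*Exception filling cache //' \
    | sort | uniq -c | sort -rn
}
```

For example, count_cache_failures apache-tomcat-x/logs/asgard.log prints one line per cache, with the most frequently failing cache first.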
Check http://localhost:8080/cache/list to identify which caches are failing to load. Note that some of them can take a few minutes.
If you suspect a cache recently failed to load once and you want to instigate an immediate attempt to fill that cache again, find the cache's name from the cache list linked above, such as Multi-region Queue, and then send an HTTP POST request to Asgard like this:
curl -d "id=Multi-region Queue" http://localhost:8080/cache/fill
If you can wait, there is no need to force a cache to fill, because all caches try to fill themselves regularly anyway.
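Cache names often contain spaces, so it can be safer to let curl encode the form value. The following is a small sketch of a wrapper around the cache fill request; the function name fill_cache and the ASGARD_URL variable are hypothetical conveniences, and the endpoint and id parameter are the ones shown above.

```shell
# Ask Asgard to refill one cache by name.
# ASGARD_URL is a hypothetical variable; it defaults to the URL used in this guide.
fill_cache() {
  cache_name="$1"
  base_url="${ASGARD_URL:-http://localhost:8080}"
  # --data-urlencode sends a POST and URL-encodes the value, including spaces
  curl --silent --show-error --data-urlencode "id=${cache_name}" "${base_url}/cache/fill"
}
```

Calling fill_cache "Multi-region Queue" sends the same request as the curl command above, with the space in the name encoded.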
If you want to start up Asgard and let users access it before all the caches are warm, then some of your cloud object lists will be empty for a while until their caches succeed in pulling data from Amazon and other endpoints. To skip the cache fill initialization process, start up Asgard with this JVM system property flag:
-DskipCacheFill=true
You can also limit the set of AWS regions your Asgard instance will interact with. To specify the region names that you want to use, start up Asgard with a comma-separated list of region names in this JVM system property:
-DonlyRegions=us-east-1,us-west-1,us-west-2,eu-west-1,ap-northeast-1
In Tomcat, you can use skipCacheFill and/or onlyRegions by adding parameters to the JAVA_OPTS configuration in your apache-tomcat-x/bin/setenv.sh file:
if [ "$1" = "start" ]; then
export JAVA_OPTS=" \
-DskipCacheFill=true \
-DonlyRegions=us-east-1,us-west-2,sa-east-1,ap-northeast-1 \
"
else
export JAVA_OPTS=""
fi
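Both setenv.sh examples in this guide assign JAVA_OPTS from scratch, so a file that needs memory settings as well as the Asgard system properties has to combine them in a single assignment. The following sketch merges a subset of the flags shown earlier; the heap sizes and region list are placeholders to adjust for your own installation.

```shell
# apache-tomcat-x/bin/setenv.sh
# Sketch: memory settings and Asgard system properties in one JAVA_OPTS.
# Heap sizes and regions are example values taken from this guide.
if [ "$1" = "start" ]; then
  export JAVA_OPTS=" \
    -Xmx4g -Xms4g \
    -Xmn2g \
    -XX:+UseConcMarkSweepGC \
    -DskipCacheFill=true \
    -DonlyRegions=us-east-1,us-west-2 \
  "
else
  export JAVA_OPTS=""
fi
```

Restart Tomcat after editing the file so that the new options take effect.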
If you want to diagnose the state of a misbehaving server, run this command to dump the current threads to catalina.out:
kill -3 `ps -ef | grep 'java.*tomcat' | grep -v grep | awk '{print $2}'`
On Mac or Linux, if shutdown.sh does not successfully stop Tomcat, run this command to kill the Tomcat process:
kill -9 `ps -ef | grep 'java.*tomcat' | grep -v grep | awk '{print $2}'`
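Because kill -9 gives the JVM no chance to shut down cleanly, it can be worth trying a normal termination signal first. The following sketch sends SIGTERM, waits a few seconds, and only then falls back to SIGKILL; the function name stop_process and the default timeout of 10 seconds are assumptions for illustration.

```shell
# Stop a process politely, then forcefully if it is still alive.
# Usage: stop_process <pid> [seconds_to_wait]
stop_process() {
  pid="$1"
  wait_seconds="${2:-10}"
  kill -15 "$pid" 2>/dev/null || return 0    # nothing to signal: already gone
  while [ "$wait_seconds" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # exited after SIGTERM
    sleep 1
    wait_seconds=$((wait_seconds - 1))
  done
  kill -9 "$pid" 2>/dev/null                 # last resort
}
```

Assuming exactly one Tomcat process is running and pgrep is available, stop_process "$(pgrep -f 'java.*tomcat')" would stop it.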
Sometimes a problem with a remote dependency causes all Asgard users to require an upgrade. If everyone's Asgard instance seems happy with the upgrade except yours, try deleting the work directory in Tomcat. This can help remove any cached files from an older Asgard installation.
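The cleanup can be scripted with a guard against an empty or wrong path. The following sketch empties the work directory rather than removing the directory itself, which has the same effect on cached files; the function name clear_tomcat_work is illustrative, and the argument is the path to your own Tomcat installation. Stop Tomcat before running it.

```shell
# Remove the contents of <tomcat_home>/work so Tomcat rebuilds them on next start.
# Stop Tomcat before calling this.
clear_tomcat_work() {
  tomcat_home="$1"
  work_dir="${tomcat_home}/work"
  if [ -z "$tomcat_home" ] || [ ! -d "$work_dir" ]; then
    echo "No work directory found at '${work_dir}'" >&2
    return 1
  fi
  # ${work_dir:?} aborts if the variable is empty, guarding the rm
  rm -rf "${work_dir:?}"/*
}
```

After it returns successfully, start Tomcat again and check whether the upgraded Asgard behaves as expected.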
To be certain of the environment settings of your Asgard instance, access http://localhost:8080/server/props to look at things like your JDK vendor and version number, your grails.env, GRAILS_HOME, user.home, user.name, java.version, grails.version and anything else that looks like a clue for troubleshooting.
If the Asgard config screen won't save credentials, try setting an ASGARD_HOME environment variable to a known writable directory and restart Tomcat. This problem seems to mainly affect Windows users, but Linux users have occasionally reported it as well.
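On Mac or Linux, it may help to confirm that the chosen directory really is writable before pointing Asgard at it. The following is a minimal sketch; the function name check_asgard_home is illustrative, and the check runs as whichever user invokes it, so run it as the user that owns the Tomcat process.

```shell
# Check that a directory can serve as ASGARD_HOME before pointing Asgard at it.
check_asgard_home() {
  dir="$1"
  mkdir -p "$dir" 2>/dev/null || true   # create it if possible, ignore failure
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "ok: ${dir} is writable"
  else
    echo "error: ${dir} is not a writable directory" >&2
    return 1
  fi
}
```

If the check passes, export ASGARD_HOME with that path in the environment Tomcat starts from, for example in setenv.sh, and restart Tomcat.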
There is currently a sporadic bug on some forms in Asgard when used in Firefox. For safety while we work on a robust fix, Asgard is read-only in Firefox.
A Netflix Original Production