Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[voq/systemlag] Lag id boundary setting for system lag functionality #6488

Merged
merged 1 commit into from
Mar 31, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions files/build_templates/docker_image_ctl.j2
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,19 @@ function preStartAction()
updateSyslogConf
}

{%- if docker_container_name == "database" %}

function setPlatformLagIdBoundaries()
{
CHASSIS_CONF=/usr/share/sonic/device/$PLATFORM/chassisdb.conf
if [ -f "$CHASSIS_CONF" ]; then
source $CHASSIS_CONF
$SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"
judyjoseph marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia We tested this change and it's not working at our side. $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start" failed with this error: Invalid database name input : 'CHASSIS_APP_DB'. However, it worked fine if we do this inside chassisDb docker. E.g.,

docker exec -i ${DOCKERNAME} $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"

So it seems sonic-db-cli is not aware of CHASSIS_APP_DB when databash.sh is executed. Any thought on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman, This is supposed to be run only in database-chassis docker in supervisor card only. Assuming you are running this in database-chassis docker in supervisor, please check if you have the CHASSIS_APP_DB defined in database_config.json for redis-chassis instance (i.e. in file /var/run/redis-chassis/sonic-db/database_config.json)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia I am not suer that this is actually run in database-chassis docker based on the following code:

https://github.com/Azure/sonic-buildimage/blob/master/files/build_templates/docker_image_ctl.j2#L174-L178

Line 174-175 are running outside the docker, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman: you are right. It is run outside docker. We can modify the function setPlatfromLagIdBoundaries() to do "docker exec ...". However we do not have to do this or even to run this inside docker. The error you are seeing is because of this CHASSIS_APP_DB not defined in /var/run/redis/sonic-db/database_config.json. We are supposed to have the same content of database_config.json in all instances of /var/run/redis<instance>/sonic-db. If CHASSIS_APP_DB is defined in database_config.json with correct instance and if the instance info is specified, sonic-cfggen will connect to correct redis server.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia We do have CHASSIS_APP_DB defined in database_config.json once switch boots up. However, it appears that /var/run/redis/sonic-db/database_config.json didn't exist yet when setPlatformLagIdBoundaries was invoked. The file was installed after chassisDb was up. Do you know if SONiC ensures /var/run/redis/sonic-db/database_config.json is installed before bringing up chassisDb?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia Unfortunately, we were observing different from what you explained. The way we set lag id range successfully is to modify setPlatformLagIdBoundaries to set it directly inside chassis database docker like this:

docker exec -i ${DOCKERNAME} $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"

When setPlatformLagIdBoundaries wass called in the script (ref. database.sh) that starts the database docker, it seems only /var/run/redis-chassis/sonic-db/database_config.json was available but not /var/run/redis/sonic-db/database_config.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysmanman, this is as designed. Chassis db starts before regular database starts. So we do not expect the /var/run/redis/sonic-db/database_config.json be available when database-chassis container is started and redis-chassis server started inside database-chassis container. I just checked the docker_image_ctl.j2. The function setPlatformLagIdBoundaries() is missing some piece of code. I do not know how we missed to push these changes to github. Following is the function we have:

function setPlatformLagIdBoundaries()
{
    if [ ! -e "/var/run/redis/sonic-db/database_config.json" ]; then
        mkdir -p /var/run/redis/sonic-db/
        cp /var/run/redis-chassis/sonic-db/database_config.json /var/run/redis/sonic-db
    fi
    CHASSIS_CONF=/usr/share/sonic/device/$PLATFORM/chassisdb.conf
    if [ -f "$CHASSIS_CONF" ]; then
        source $CHASSIS_CONF
        $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_START" "$lag_id_start"
        $SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_END" "$lag_id_end"
    fi
}

But your fix is also valid. I'll put a PR to fix this. Thanks for helping to find this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia Thanks for checking and confirming!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia Thanks for checking and confirming!

@ysmanman, raised issue #7912 and fixed the issue by PR #7911

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia thanks for the update!

$SONIC_DB_CLI CHASSIS_APP_DB SET "SYSTEM_LAG_ID_END" "$lag_id_end"
fi
}
{%- endif %}

function postStartAction()
{
{%- if docker_container_name == "database" %}
Expand Down Expand Up @@ -162,6 +175,7 @@ function postStartAction()
($(docker exec -i ${DOCKERNAME} $SONIC_DB_CLI CHASSIS_APP_DB PING | grep -c True) -gt 0) ]]; do
sleep 1
done
setPlatformLagIdBoundaries
REDIS_SOCK="/var/run/redis-chassis/redis_chassis.sock"
fi
chgrp -f redis $REDIS_SOCK && chmod -f 0760 $REDIS_SOCK
Expand Down