Analysis of two reasons for "read error on connection"

created at 07-04-2021 views: 45

Preface

Recently, the online php module occasionally prompt error read error on connection; the specific error log is as follows:

Uncaught exception 'RedisException' with message 'read error on connection' 

After analysis and learning, two reasons were found that may cause phpredis to return read error on connection:

  • Execution timeout
  • Use a disconnected connection

The following will conduct a specific analysis of these two situations.

1. execution timeout

Timeout can be divided into two situations: one is caused by the timeout period set by the client is too short; the other is that the client does not set the timeout period, but the server execution time exceeds the default timeout period setting.

1.1 Simulated reproduction

1.1.1 The timeout period set by the client is too short

The get operation in the test environment takes about 0.1ms to execute; therefore, the client sets the execution timeout time to 0.01ms, and the test script is as follows:

<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }
    //Set the timeout time to 0.1ms
    $rds->setOption(3,0.0001);
    $rds->get("aa");
} catch (Exception $e) {
    var_dump ($e);
}

Manual execution of the script will catch the read error on connection exception;

read error on connection

1.1.2 The client does not set the timeout time, and the default timeout time is used
The client does not set a timeout period, but during the execution of the command, the timeout reaches the default value set by php. For details, see phpredis subscribe timeout problem and solution analysis

1.2 Cause analysis

1.2.1 strace analysis
Viewing the execution process through strace, you can find that after sending the get aa command, poll wants to pull the POLLIN event and waits for a timeout:

strace analysis

1.2.2 Code logic analysis
The php connection redis uses the phpredis extension. You can find this error is located in the redis_sock_gets function of the phpredis/library.c file by searching the full text of the phpredis source code for'read error on connection', see phpredis for details;

The redis_sock_gets function of the library.c file of phpredis

/*
 * Processing for variant reply types (think EVAL)
 */

PHP_REDIS_API int
redis_sock_gets(RedisSock *redis_sock, char *buf, int buf_size,
                size_t *line_size)
{
    // Handle EOF
    if(-1 == redis_check_eof(redis_sock, 0)) {
        return -1;
    }

    if(php_stream_get_line(redis_sock->stream, buf, buf_size, line_size)
                           == NULL)
    {
        char *errmsg = NULL;

        if (redis_sock->port < 0) {
            spprintf(&errmsg, 0, "read error on connection to %s", ZSTR_VAL(redis_sock->host));
        } else {
            spprintf(&errmsg, 0, "read error on connection to %s:%d", ZSTR_VAL(redis_sock->host), redis_sock->port);
        }
        // Close our socket
        redis_sock_disconnect(redis_sock, 1);

        // Throw a read error exception
        REDIS_THROW_EXCEPTION(errmsg, 0);
        efree(errmsg);
        return -1;
    }

    /* We don't need \r\n */
    *line_size-=2;
    buf[*line_size]='\0';

    /* Success! */
    return 0;
}

Attachment: This msg seems to have more host and port than the online msg, it is because of the recent merged branch, as shown in the figure

message

It can be found from the source code that if php_stream_get_line reads stream data as NUll, it will throw a read error on connection error. So when will php_stream_get_line return NULL, corresponding to the php-src/main/streams/streams.c file of the php source code, see php-src for details;

/* If buf == NULL, the buffer will be allocated automatically and will be of an
 * appropriate length to hold the line, regardless of the line length, memory
 * permitting */
PHPAPI char *_php_stream_get_line(php_stream *stream, char *buf, size_t maxlen,
        size_t *returned_len)
{
    size_t avail = 0;
    size_t current_buf_size = 0;
    size_t total_copied = 0;
    int grow_mode = 0;
    char *bufstart = buf;

    if (buf == NULL) {
        grow_mode = 1;
    } else if (maxlen == 0) {
        return NULL;
    }

    /*
     * If the underlying stream operations block when no new data is readable,
     * we need to take extra precautions.
     *
     * If there is buffered data available, we check for a EOL. If it exists,
     * we pass the data immediately back to the caller. This saves a call
     * to the read implementation and will not block where blocking
     * is not necessary at all.
     *
     * If the stream buffer contains more data than the caller requested,
     * we can also avoid that costly step and simply return that data.
     */

    for (;;) {
        avail = stream->writepos - stream->readpos;

        if (avail > 0) {
            size_t cpysz = 0;
            char *readptr;
            const char *eol;
            int done = 0;

            readptr = (char*)stream->readbuf + stream->readpos;
            eol = php_stream_locate_eol(stream, NULL);

            if (eol) {
                cpysz = eol - readptr + 1;
                done = 1;
            } else {
                cpysz = avail;
            }

            if (grow_mode) {
                /* allow room for a NUL. If this realloc is really a realloc
                 * (ie: second time around), we get an extra byte. In most
                 * cases, with the default chunk size of 8K, we will only
                 * incur that overhead once.  When people have lines longer
                 * than 8K, we waste 1 byte per additional 8K or so.
                 * That seems acceptable to me, to avoid making this code
                 * hard to follow */
                bufstart = erealloc(bufstart, current_buf_size + cpysz + 1);
                current_buf_size += cpysz + 1;
                buf = bufstart + total_copied;
            } else {
                if (cpysz >= maxlen - 1) {
                    cpysz = maxlen - 1;
                    done = 1;
                }
            }

            memcpy(buf, readptr, cpysz);

            stream->position += cpysz;
            stream->readpos += cpysz;
            buf += cpysz;
            maxlen -= cpysz;
            total_copied += cpysz;

            if (done) {
                break;
            }
        } else if (stream->eof) {
            break;
        } else {
            /* XXX: Should be fine to always read chunk_size */
            size_t toread;

            if (grow_mode) {
                toread = stream->chunk_size;
            } else {
                toread = maxlen - 1;
                if (toread > stream->chunk_size) {
                    toread = stream->chunk_size;
                }
            }

            php_stream_fill_read_buffer(stream, toread);

            if (stream->writepos - stream->readpos == 0) {
                break;
            }
        }
    }

    if (total_copied == 0) {
        if (grow_mode) {
            assert(bufstart == NULL);
        }
        return NULL;
    }

    buf[0] = '\0';
    if (returned_len) {
        *returned_len = total_copied;
    }

    return bufstart;
}

It can be seen from the php_stream_get_line method that NULL will be returned only when bufstart=NULL, and bufstart=NULL means that no data has been received in the buf buffer and stream, including the terminator.

1.3 Solution

There are two ways for the client to set a reasonable timeout period:

1.3.1 int_set

ini_set('default_socket_timeout', -1);

1.3.2 setOption

$redis->setOption(Redis::OPT_READ_TIMEOUT, -1);

Note: -1 means no timeout, you can also set the timeout to the time you want, the previous time is set to 0.01ms

2. Reuse the disconnected connection

Using a disconnected connection may also cause'read error on connection'. Here we need to distinguish between'Connection closed' and'Connection lost'.

2.1 Disconnected

2.1.1 Connection closed
The test script is as follows, the client actively closes the connection, but then uses the broken link below, and then throws an exception and returns connection closed

<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }

    $rds->close();

    var_dump($rds->get("aa"));
} catch (Exception $e) {
    var_dump ($e);
}

The test results are as follows:

 test results

2.1.2 Connection lost
Refer to Work around PHP bug of liveness checking to write a test script test.php as follows. After connecting to redis, kill the redis process before executing the command:

<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }

    echo "Press any key to continue ...";
    fgetc(STDIN);
    var_dump($rds->get("aa"));
} catch (Exception $e) {
    var_dump ($e);
}

in case

The execution steps are as follows

  • The terminal executes the php test.php script
  • Open another terminal to kill the redis process
  • Arbitrary input in the first terminal, press Enter

At this time,Connection lost will appear

Connection lost

2.1.3 read error on connection
After connecting to redis, in the process of continuously executing commands, if the connection is disconnected, read error on connection will be returned. The test script is as follows:

<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }

    while(1){
       $rds->get("aa");
    }

} catch (Exception $e) {
    var_dump ($e);
}

in case

The execution steps are as follows

  • The terminal executes the php test.php script
  • Open another terminal to kill the redis process

An exception is thrown at this time:

exception

Or the newly opened terminal connects to the redis server and executes client kill as follows:

client kill

The php script being executed will also catch the exception read error on connection.

2.2 php-fpm & pconnect

In the cli mode, connect to the redis server through pconnect through php. Although the business code shows that close is called, the connection is not disconnected. Fpm will maintain the connection to redis. The next request will not be executed when pconnect is executed again. Will really request redis to establish a connection. This will also cause a problem. If the connection has been disconnected, the next request may directly use the last disconnected connection. For this, phpredis also has comments in its source code, see php-src for details

php-fpm & pconnect

So php-fpm reuse a broken connection may cause this kind of error.

The simplest solution to this situation is to change the long link to short link

3. summary

There are many analyses on execution timeouts and their solutions on the Internet, but there are few analyses on disconnected and reused connections. Therefore, this analysis is used as a record on the one hand, and on the other hand, I hope to give some help to small partners facing the same problem.

created at:07-04-2021
edited at: 07-04-2021: