[QUERY] Azure Communication Services Chat slow/long response times #47463
Comments
Also seeing similar behaviour with Communication Services Email: connections closing or timing out at around 1 minute under peak load that should be well within our sending quota.
Thanks for the feedback! We are routing this to the appropriate team for follow-up.
GetAccessToken (…) - this can't be expected or within range of what should be considered normal?
Some more slow invocations, or ones with timeouts:
Hi @ErikAndreas, can you provide the MS-CV for the requests as well? They are in the response headers. Please also include the timestamps at which the requests were made.
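In case it helps collect those values, here is a minimal sketch (names like `LogMsCvPolicy` are illustrative, not part of the SDK) of logging the MS-CV response header for every Chat call via an Azure.Core pipeline policy, so slow requests can be correlated with service-side telemetry:

```csharp
// Illustrative sketch: log the MS-CV header of every response from the Chat pipeline.
using System;
using Azure.Communication.Chat;
using Azure.Core;
using Azure.Core.Pipeline;

public class LogMsCvPolicy : HttpPipelineSynchronousPolicy
{
    public override void OnReceivedResponse(HttpMessage message)
    {
        if (message.Response.Headers.TryGetValue("MS-CV", out var msCv))
        {
            Console.WriteLine($"{DateTimeOffset.UtcNow:O} {message.Request.Uri} MS-CV={msCv}");
        }
    }
}

// Usage: register the policy when constructing the client.
// var options = new ChatClientOptions();
// options.AddPolicy(new LogMsCvPolicy(), HttpPipelinePosition.PerCall);
// var chatClient = new ChatClient(endpoint, tokenCredential, options);
```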
Hi @jiminwen-msft, a few other samples with timestamps (UTC), though:
As you can understand, the end-user experience for this one request to our API is not that great (there are more requests made from this function invocation to ACS, but these were some of the slowest logged responses, per "duration" in App Insights)...
Other endpoints as well: invocation ID 46aa3e10-dde6-4396-98c5-dca89c4ddc33
@ErikAndreas
Yes, the issue is still there - the slowest endpoint is also the one having the most timeouts. A few slow entries just now: 2024-12-18T08:01:19.3342841Z, 2024-12-18T08:01:19.9186785Z. Looking at the performance blade of App Insights with target '(our acs instance).europe.communication.azure.com' for the last 24 hrs, we're still at the same approximate numbers as my previous screenshot: 50th percentile at 710 ms, 99th at 2.4 s - rather slow for an API, I'd say...
Since the slowest responses are auth-related, I suggest resolving that first.
I cannot find the requests by querying the thread IDs on our service; they may not be arriving on our end. Can you share the resource ID? It is available on the overview page of your ACS resource. @ErikAndreas
Not following - the GetToken(Async) method (which I assume calls the underlying REST method
Can I somehow provide you the resource ID privately (I can't post it publicly)? But you can't be suggesting that the requests aren't arriving - we are getting replies/data, just very slowly...
@ErikAndreas Does the issue apply to a small percentage of requests? (p95, p99) |
No, as previously stated and shown by my App Insights screenshot, we're fairly constant at 600+ ms for the 50th percentile. I strongly disagree that any of the numbers on your graph (lowest at 350 ms) should be considered "normal"; such response times are way above anything to be expected from a modern API. Consider that any real feature requires multiple calls to the service - again, the UX of such a feature is not great. Our experience is that it wasn't this bad when we started using the service, but we don't have numbers to show for it. Can we get an "official" statement that those numbers are "normal", i.e. what we should expect from ACS (Chat)?
@ErikAndreas The graph above is the list_messages operation on the service side. Does it match the latency you see for this particular operation? If it matches, the latency is expected. If the latency on your side is significantly higher, it could be a network issue or something in the SDK.
So your graph/numbers exclude GetAccessToken? That might make sense. Please do check with the other team. I'll be off for the holidays now and will check back later.
@jiminwen-msft I'd say we're seeing higher response times than indicated by your service-side metrics - but that's from an SDK-usage perspective, and we have reason to believe the SDK does not handle paging very well, see #45816 (page size seems to be fixed at six messages) - any feedback on that?
Did you try using the Azure CLI to invoke the target endpoints with the same request? If the high-latency issues also exist for the Azure CLI, they are likely caused by network settings. The SDK itself does not do much extra processing after getting responses from the service side.
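As an alternative way to separate network latency from SDK overhead, roughly the same comparison can be made from a small console app that calls the REST endpoint directly. A minimal sketch - the resource endpoint, thread ID, user access token, and api-version are placeholders to fill in, and the exact REST path/version should be confirmed against the Chat REST docs:

```csharp
// Illustrative sketch: time one raw list-messages request against the Chat REST API,
// sending the ACS user access token as a Bearer token.
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class RawLatencyProbe
{
    static async Task Main()
    {
        var endpoint = "https://<your-resource>.europe.communication.azure.com";
        var threadId = "<chat-thread-id>";
        var userAccessToken = "<user access token issued for the chat user>";

        using var http = new HttpClient();
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", userAccessToken);

        var url = $"{endpoint}/chat/threads/{threadId}/messages?api-version=<api-version>";
        var sw = Stopwatch.StartNew();
        using var response = await http.GetAsync(url);
        sw.Stop();

        var msCv = response.Headers.TryGetValues("MS-CV", out var values)
            ? string.Join(",", values)
            : "n/a";
        Console.WriteLine($"{(int)response.StatusCode} in {sw.ElapsedMilliseconds} ms, MS-CV={msCv}");
    }
}
```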
I think max_page_size can be set as a parameter when listing messages. Can you try setting a higher value?
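For reference, a hedged sketch of what that might look like with the .NET SDK, assuming the `pageSizeHint` passed to `AsPages` is honored by the service (the paging issue linked above suggests it may not be):

```csharp
// Illustrative sketch: request larger pages when listing messages (inside an async method).
using Azure.Communication.Chat;

// chatThreadClient is an existing ChatThreadClient instance.
await foreach (var page in chatThreadClient.GetMessagesAsync().AsPages(pageSizeHint: 50))
{
    foreach (ChatMessage message in page.Values)
    {
        // Process each message in the page.
    }
}
```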
@ErikAndreas Are you running your app in Azure WebJobs or Functions by any chance? I am having a lot of trouble with throughput in WebJobs, and QueueListeners creating a lot of outgoing TCP connections seems to be a factor. See Azure/azure-webjobs-sdk#3116
@gabbsmo As per the original issue description, we're using Azure Functions .NET 8 isolated (both consumption and premium plans, varying by environment, but I'd say we see the same slowness regardless).
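Not from this thread, but related to the connection-reuse angle: one thing to rule out is constructing ACS clients per invocation in the isolated worker, since that can open new TCP connections on every call. A minimal sketch of registering a single CommunicationIdentityClient for the whole worker process - the `ACS_CONNECTION_STRING` app setting name is illustrative:

```csharp
// Illustrative sketch: Program.cs for a .NET 8 isolated worker that reuses one
// CommunicationIdentityClient across all function invocations.
using System;
using Azure.Communication.Identity;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults()
    .ConfigureServices(services =>
    {
        // One identity client per process instead of one per invocation.
        services.AddSingleton(_ =>
            new CommunicationIdentityClient(
                Environment.GetEnvironmentVariable("ACS_CONNECTION_STRING")));
    })
    .Build();

host.Run();
```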
Library name and version
Azure.Communication.Chat 1.3.1
Query/Question
We're constantly seeing slower and slower response times from various Chat SDK endpoints (using the Chat SDK from Azure Functions). Most endpoints take 200+ ms, and sometimes we get timeouts (code 499?).
This makes e.g. listing chat threads and some info about their contents (GetAccessToken, GetMessages + GetReadReceipts), a rather "standard" use case, almost unbearable from a UX perspective when each call takes that long.
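For reference, a minimal sketch of that flow with per-call timing, to illustrate how individual latencies of 200+ ms compound across a single page view. The connection string, endpoint, user ID, and thread ID are placeholders, not values from the issue:

```csharp
// Illustrative sketch: issue a token, list messages, and list read receipts,
// timing each step with a Stopwatch.
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Azure.Communication;
using Azure.Communication.Chat;
using Azure.Communication.Identity;

class ChatLatencyWalkthrough
{
    static async Task Run(string connectionString, Uri endpoint, string userId, string threadId)
    {
        var sw = Stopwatch.StartNew();

        var identityClient = new CommunicationIdentityClient(connectionString);
        var token = await identityClient.GetTokenAsync(
            new CommunicationUserIdentifier(userId), new[] { CommunicationTokenScope.Chat });
        Console.WriteLine($"GetToken: {sw.ElapsedMilliseconds} ms");

        var chatClient = new ChatClient(endpoint, new CommunicationTokenCredential(token.Value.Token));
        var threadClient = chatClient.GetChatThreadClient(threadId);

        sw.Restart();
        await foreach (ChatMessage _ in threadClient.GetMessagesAsync()) { }
        Console.WriteLine($"GetMessages: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        await foreach (ChatMessageReadReceipt _ in threadClient.GetReadReceiptsAsync()) { }
        Console.WriteLine($"GetReadReceipts: {sw.ElapsedMilliseconds} ms");
    }
}
```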
InvocationId f04bd15e-a07b-4bf1-8c5b-5c3c43f9b1f5 makes 11 requests to Chat SDK endpoints, with response times in the range of 101 - 352 ms.
We're also seeing cases where GetAccessToken (POST /identities/<ACS UserId>/:issueAccessToken) completely times out:
b5b1845e-fa3c-4000-8632-315256c01552
3206e6ee-1623-482c-8f8c-609fafc80d80
3a121576-8b85-4a35-a3c9-598e95e67f9e
Environment
Azure Functions, .NET 8 isolated worker process