将sd卡格式化为便携式存储,出现手机重启失败

一.问题描述. 1. 标题: 【专项测试】【T卡】【>10%】将SD卡格式化为内部存储后选择立即迁移数据,再安装应用,然后数据迁移回内部存储,最后将SD卡格式化为便携式存储,出现手机重启. 2. 前提条件: 插入Kingston 256G SD卡,不插SIM卡 3.【测试步骤】:

一.问题描述.

1. 标题:

【专项测试】【T卡】【>10%】将SD卡格式化为内部存储后选择立即迁移数据,再安装应用,然后数据迁移回内部存储,最后将SD卡格式化为便携式存储,出现手机重启.

2. 前提条件: 插入Kingston 256G SD卡,不插SIM卡

3.【测试步骤】:

 (1)插SD卡,格式化为内部存储后立即迁移数据;
 (2)连接电脑,选择文件传输模式,将apk文件拖到SD卡中;
 (3)进入设置-存储-SD卡-系统自带的文件管理器,点击拖到的3个apk进行安装;
 (4)在待机界面分别进入安装的应用后,概览界面清除后台;
 (5)进入设置-存储-设备,选择迁移数据;
 (6)设置-存储-SD卡,选择将SD卡格式化为便携式存储

4.预期结果:可以成功格式化为便携式存储

5.实际结果:格式化进度在20%时卡住,出现手机自动重启现象.

二.日志分析:

  发生了watchdog ,阻塞在ActivityManagerService.monitor()函数. Watchdog 监听android.fg 进程,当长时间阻塞时,会被watchdog 杀掉,因为是system server 进程,所以手机就发生了重启.

08-18 15:21:18.726 8517 8541 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in monitor com.android.server.am.ActivityManagerService on foreground thread (android.fg), Blocked in handler on main thread (main), Blocked in handler on display thread (android.display)
08-18 15:21:18.734 8517 8541 W Watchdog: android.fg annotated stack trace:
08-18 15:21:18.770 0 0 I [T313686] kworker/3: 1: [name:ufs_mediatek_mod&]ufsshpb_resume:3873 info: ufsshpb_lu 2 resume. do_task_work 0 retry 0
08-18 15:21:18.770 0 0 I [T313686] kworker/3: 1: [name:ufs_mediatek_mod&]ufstw_resume:686 info: ufstw resume start.
08-18 15:21:18.734 8517 8541 W Watchdog: at com.android.server.am.ActivityManagerService.monitor(ActivityManagerService.java:16273)
08-18 15:21:18.734 8517 8541 W Watchdog: - waiting to lock <0x0d7e51e3> (a com.android.server.am.ActivityManagerService)
08-18 15:21:18.734 8517 8541 W Watchdog: at com.android.server.Watchdog$HandlerChecker.run(Watchdog.java:354)
08-18 15:21:18.734 8517 8541 W Watchdog: at android.os.Handler.handleCallback(Handler.java:942)
08-18 15:21:18.735 8517 8541 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:99)
08-18 15:21:18.735 8517 8541 W Watchdog: at android.os.Looper.loopOnce(Looper.java:210)
08-18 15:21:18.735 8517 8541 W Watchdog: at android.os.Looper.loop(Looper.java:297)
08-18 15:21:18.735 8517 8541 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:67)
08-18 15:21:18.735 8517 8541 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:44)
08-18 15:21:18.735 8517 8541 W Watchdog: main annotated stack trace:
08-18 15:21:18.735 8517 8541 W Watchdog: at com.android.server.wm.ActivityTaskManagerService$LocalService.isBaseOfLockedTask(ActivityTaskManagerService.java:6659)
08-18 15:21:18.735 8517 8541 W Watchdog: - waiting to lock <0x0c86a295> (a com.android.server.wm.WindowManagerGlobalLock)
08-18 15:21:18.735 8517 8541 W Watchdog: at com.android.server.pm.DeletePackageHelper.deletePackageVersionedInternal(DeletePackageHelper.java:682)
08-18 15:21:18.735 8517 8541 W Watchdog: at com.android.server.pm.PackageManagerService.deletePackageVersioned(PackageManagerService.java:3261)
08-18 15:21:18.735 8517 8541 W Watchdog: at com.android.server.pm.StorageEventHelper.onVolumeForgotten(StorageEventHelper.java:117)
08-18 15:21:18.735 8517 8541 W Watchdog: - locked <0x093b1eea> (a com.android.server.pm.PackageManagerTracedLock)
08-18 15:21:18.735 8517 8541 W Watchdog: at android.os.storage.StorageManager$StorageEventListenerDelegate.lambda$onVolumeForgotten$4$android-os-storage-StorageManager$StorageEventListenerDelegate(StorageManager.java:379)
08-18 15:21:18.735 8517 8541 W Watchdog: at android.os.storage.StorageManager$StorageEventListenerDelegate$$ExternalSyntheticLambda1.run(Unknown Source:4)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Handler.handleCallback(Handler.java:942)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:99)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Looper.loopOnce(Looper.java:210)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Looper.loop(Looper.java:297)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.server.SystemServer.run(SystemServer.java:981)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.server.SystemServer.main(SystemServer.java:657)
08-18 15:21:18.736 8517 8541 W Watchdog: at java.lang.reflect.Method.invoke(Native Method)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:569)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:997)
08-18 15:21:18.736 8517 8541 W Watchdog: android.display annotated stack trace:
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.server.wm.WindowManagerService.reportFocusChanged(WindowManagerService.java:5247)
08-18 15:21:18.736 8517 8541 W Watchdog: - waiting to lock <0x0c86a295> (a com.android.server.wm.WindowManagerGlobalLock)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.server.wm.InputManagerCallback$$ExternalSyntheticLambda0.accept(Unknown Source:6)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.internal.util.function.pooled.PooledLambdaImpl.doInvoke(PooledLambdaImpl.java:281)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.internal.util.function.pooled.PooledLambdaImpl.invoke(PooledLambdaImpl.java:204)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.internal.util.function.pooled.OmniFunction.run(OmniFunction.java:97)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Handler.handleCallback(Handler.java:942)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:99)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Looper.loopOnce(Looper.java:210)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.Looper.loop(Looper.java:297)
08-18 15:21:18.736 8517 8541 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:67)
08-18 15:21:18.736 8517 8541 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:44)
08-18 15:21:18.736 8517 8541 W Watchdog: *** GOODBYE!
08-18 15:21:18.743 8517 8541 D AES : AEEIOCTL_RT_MON_Kick IOCTL,cmd= 2147774474, lParam=330.

三. 代码分析:

    发现ActivityManagerService.java的monitor 函数,synchronized 里的this对象阻塞.

androidt/frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

16271      /** In this method we try to acquire our lock to make sure that we have not deadlocked */
16272      public void monitor() {
16273          synchronized (this) { }
16274      }

四. 查看watchdog 发生的堆栈打印:

   1. ActivityManagerService.monitor 是在android.fg 线程,目前状态是Blocked. 而且是被thread 198 的线程阻塞.

"android.fg" prio=5 tid=13 Blocked
  | group="main" sCount=1 ucsCount=0 flags=1 obj=0x142c3ac0 self=0xb400007bf6e691b0
  | sysTid=1432 nice=0 cgrp=foreground sched=1073741824/0 handle=0x79faf65cb0
  | state=S schedstat=( 3468895480 3136054587 9924 ) utm=288 stm=58 core=4 HZ=100
  | stack=0x79fae62000-0x79fae64000 stackSize=1039KB
  | held mutexes=
  at com.android.server.am.ActivityManagerService.monitor(ActivityManagerService.java:16263)
  - waiting to lock <0x07c644c5> (a com.android.server.am.ActivityManagerService) held by thread 198
  at com.android.server.Watchdog$HandlerChecker.run(Watchdog.java:354)
  at android.os.Handler.handleCallback(Handler.java:942)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loopOnce(Looper.java:208)
  at android.os.Looper.loop(Looper.java:295)
  at android.os.HandlerThread.run(HandlerThread.java:67)
  at com.android.server.ServiceThread.run(ServiceThread.java:44)

2. 查看thread 198 的打印栈. thread 198 的线程名称是binder:1412_1C, 而且属于Blocked 状态. 被阻塞的线程是thread 1.

 "binder:1412_1C" prio=5 tid=198 Blocked
  | group="main" sCount=1 ucsCount=0 flags=1 obj=0x142e2e08 self=0xb400007bf704c590
  | sysTid=6253 nice=-2 cgrp=foreground sched=1073741824/0 handle=0x79abb1fcb0
  | state=S schedstat=( 2014008775 1150188497 9002 ) utm=146 stm=55 core=2 HZ=100
  | stack=0x79aba28000-0x79aba2a000 stackSize=991KB
  | held mutexes=
  at com.android.server.pm.PackageManagerService$PackageManagerInternalImpl.notifyPackageUse(PackageManagerService.java:6489)
  - waiting to lock <0x0eed63f5> (a com.android.server.pm.PackageManagerTracedLock) held by thread 1
  at com.android.server.am.ActivityManagerService.notifyPackageUse(ActivityManagerService.java:2849)
  at com.android.server.am.ActiveServices.realStartServiceLocked(ActiveServices.java:4396)
  at com.android.server.am.ActiveServices.bringUpServiceLocked(ActiveServices.java:4233)
  at com.android.server.am.ActiveServices.bindServiceLocked(ActiveServices.java:2956)
  at com.android.server.am.ActivityManagerService.bindServiceInstance(ActivityManagerService.java:12789)
  - locked <0x07c644c5> (a com.android.server.am.ActivityManagerService)
  at com.android.server.am.ActivityManagerService.bindServiceInstance(ActivityManagerService.java:12745)
  at android.app.ContextImpl.bindServiceCommon(ContextImpl.java:2035)
  at android.app.ContextImpl.bindServiceAsUser(ContextImpl.java:1974)
  at com.android.server.job.JobServiceContext.executeRunnableJob(JobServiceContext.java:319)
  - locked <0x02bbd241> (a java.lang.Object)
  at com.android.server.job.JobConcurrencyManager.startJobLocked(JobConcurrencyManager.java:1118)
  at com.android.server.job.JobConcurrencyManager.assignJobsToContextsInternalLocked(JobConcurrencyManager.java:898)
  at com.android.server.job.JobConcurrencyManager.assignJobsToContextsLocked(JobConcurrencyManager.java:620)
  at com.android.server.job.JobSchedulerService.maybeRunPendingJobsLocked(JobSchedulerService.java:2637)
  at com.android.server.job.JobSchedulerService.scheduleAsPackage(JobSchedulerService.java:1259)
  - locked <0x02bbd241> (a java.lang.Object)
  at com.android.server.job.JobSchedulerService$JobSchedulerStub.enqueue(JobSchedulerService.java:3114)
  at android.app.job.IJobScheduler$Stub.onTransact(IJobScheduler.java:165)
  at android.os.Binder.execTransactInternal(Binder.java:1280)
  at android.os.Binder.execTransact(Binder.java:1244)

3. 查看thread 1.  thread 1的名称是main, 也是主线程. 目前状态是Blocked. 等待的线程是thread 175.

"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 ucsCount=0 flags=1 obj=0x7230aaa8 self=0xb400007bf6e33380
  | sysTid=1412 nice=-2 cgrp=foreground sched=1073741824/0 handle=0x7d42e1e4f8
  | state=S schedstat=( 7442442904 3638959461 25618 ) utm=436 stm=307 core=4 HZ=100
  | stack=0x7fe25cf000-0x7fe25d1000 stackSize=8188KB
  | held mutexes=
  at com.android.server.wm.ActivityTaskManagerService$LocalService.isBaseOfLockedTask(ActivityTaskManagerService.java:6659)
  - waiting to lock <0x06ff71a4> (a com.android.server.wm.WindowManagerGlobalLock) held by thread 175
  at com.android.server.pm.DeletePackageHelper.deletePackageVersionedInternal(DeletePackageHelper.java:682)
  at com.android.server.pm.PackageManagerService.deletePackageVersioned(PackageManagerService.java:3261)
  at com.android.server.pm.StorageEventHelper.onVolumeForgotten(StorageEventHelper.java:117)
  - locked <0x0eed63f5> (a com.android.server.pm.PackageManagerTracedLock)
  at android.os.storage.StorageManager$StorageEventListenerDelegate.lambda$onVolumeForgotten$4$android-os-storage-StorageManager$StorageEventListenerDelegate(StorageManager.java:379)
  at android.os.storage.StorageManager$StorageEventListenerDelegate$$ExternalSyntheticLambda1.run(unavailable:4)
  at android.os.Handler.handleCallback(Handler.java:942)
  at android.os.Handler.dispatchMessage(Handler.java:99) 

4. 查看线程 175.  thread 175 的名称是binder:1412_13,也是处于Blocked 状态. 发现thread 175 被 thread 1 阻塞了. 从这里就看到了一个死锁的问题. thread 175 等待 thread 1 ,而thread 1 又在等待 thread 175.

 "binder:1412_13" prio=5 tid=175 Blocked
  | group="main" sCount=1 ucsCount=0 flags=1 obj=0x142e29d0 self=0xb400007bf6fe40d0
  | sysTid=5336 nice=-10 cgrp=foreground sched=1073741824/0 handle=0x7967815cb0
  | state=S schedstat=( 3461362720 2761431048 14469 ) utm=247 stm=98 core=2 HZ=100
  | stack=0x796771e000-0x7967720000 stackSize=991KB
  | held mutexes=
  at com.android.server.pm.UserManagerService.isQuietModeEnabled(UserManagerService.java:1283)
  - waiting to lock <0x0eed63f5> (a com.android.server.pm.PackageManagerTracedLock) held by thread 1
  at android.os.UserManager.isQuietModeEnabled(UserManager.java:4567)
  at com.android.server.wm.ActivityStartInterceptor.interceptQuietProfileIfNeeded(ActivityStartInterceptor.java:232)
  at com.android.server.wm.ActivityStartInterceptor.intercept(ActivityStartInterceptor.java:166)
  at com.android.server.wm.ActivityStarter.executeRequest(ActivityStarter.java:1059)
  at com.android.server.wm.ActivityStarter.execute(ActivityStarter.java:702)
  - locked <0x06ff71a4> (a com.android.server.wm.WindowManagerGlobalLock)
  at com.android.server.wm.ActivityTaskManagerService.startActivityAsUser(ActivityTaskManagerService.java:1250)
  at com.android.server.wm.ActivityTaskManagerService.startActivityAsUser(ActivityTaskManagerService.java:1213)
  at com.android.server.wm.ActivityTaskManagerService.startActivity(ActivityTaskManagerService.java:1188)
  at android.app.IActivityTaskManager$Stub.onTransact(IActivityTaskManager.java:893)
  at com.android.server.wm.ActivityTaskManagerService.onTransact(ActivityTaskManagerService.java:5241)
  at android.os.Binder.execTransactInternal(Binder.java:1280)
  at android.os.Binder.execTransact(Binder.java:1244)

5. 整理watchdog 发生的原因:

简单示意图,死锁的原因是tid=1 和 tid=175 相互等待导致的,只要解此锁即可.

                                      "android.fg"(tid=13) 
                                             |
                                             |
                                           等待
                                             |
                                             |
                                             V
                                   "binder:1412_1C"(tid=198)
                                            |
                                            |
                                          等待
                                            |
                                            |
                                           V
                                      "main"(tid=1)<-------------------
                                             |                                      |
                                             |                                      |
                                            等待                              等待
                                             |                                      |
                                             |                                      |
                                            V                                      |
                                     "binder:1412_13"(tid=175)-----

五. 查看代码.

 1. androidt/frameworks/base/services/core/java/com/android/server/pm/StorageEventHelper.java

    此时在做sdcard 格式化,调用了deletePackageVersioned 函数.

103      @Override
104      public void onVolumeForgotten(String fsUuid) {
105          if (TextUtils.isEmpty(fsUuid)) {
106              Slog.e(TAG, "Forgetting internal storage is probably a mistake; ignoring");
107              return;
108          }
109  
110          // Remove any apps installed on the forgotten volume
111          synchronized (mPm.mLock) {
112              final List<? extends PackageStateInternal> packages =
113                      mPm.mSettings.getVolumePackagesLPr(fsUuid);
114              for (PackageStateInternal ps : packages) {
115                  Slog.d(TAG, "Destroying " + ps.getPackageName()
116                          + " because volume was forgotten");
117                  mPm.deletePackageVersioned(new VersionedPackage(ps.getPackageName(),
118                                  PackageManager.VERSION_CODE_HIGHEST),
119                          new PackageManager.LegacyPackageDeleteObserver(null).getBinder(),
120                          UserHandle.USER_SYSTEM, PackageManager.DELETE_ALL_USERS);
121                  // Try very hard to release any references to this package
122                  // so we don't risk the system server being killed due to
123                  // open FDs
124                  AttributeCache.instance().removePackage(ps.getPackageName());
125              }
126  
127              mPm.mSettings.onVolumeForgotten(fsUuid);
128              mPm.writeSettingsLPrTEMP();
129          }
130      }

2. androidt/frameworks/base/services/core/java/com/android/server/wm/ActivityTaskManagerService.java

6657          @Override
6658          public boolean isBaseOfLockedTask(String packageName) {
6659              synchronized (mGlobalLock) {
6660                  return getLockTaskController().isBaseOfLockedTask(packageName);
6661              }
6662          }

3.  androidt/frameworks/base/services/core/java/com/android/server/wm/ActivityStarter.java

685              int res;
686              synchronized (mService.mGlobalLock) {
687                  final boolean globalConfigWillChange = mRequest.globalConfig != null
688                          && mService.getGlobalConfiguration().diff(mRequest.globalConfig) != 0;
689                  final Task rootTask = mRootWindowContainer.getTopDisplayFocusedRootTask();
690                  if (rootTask != null) {
691                      rootTask.mConfigWillChange = globalConfigWillChange;
692                  }
693                  ProtoLog.v(WM_DEBUG_CONFIGURATION, "Starting activity when config "
694                          + "will change = %b", globalConfigWillChange);
695  
696                  final long origId = Binder.clearCallingIdentity();
697  
698                  res = resolveToHeavyWeightSwitcherIfNeeded();
699                  if (res != START_SUCCESS) {
700                      return res;
701                  }
702                  res = executeRequest(mRequest);
703  
704                  Binder.restoreCallingIdentity(origId);

4.androidt/frameworks/base/services/core/java/com/android/server/pm/UserManagerService.java

从堆栈中可以看出.mPackagesLock 和mPm.mLock 是同一个对象,都是0x0eed63f5.

1281      @Override
1282      public boolean isQuietModeEnabled(@UserIdInt int userId) {
1283          synchronized (mPackagesLock) {
1284              UserInfo info;
1285              synchronized (mUsersLock) {
1286                  info = getUserInfoLU(userId);
1287              }
1288              if (info == null || !info.isManagedProfile()) {
1289                  return false;
1290              }
1291              return info.isQuietModeEnabled();
1292          }
1293      }

 五. 发现.

    从堆栈和代码中发现,"main"(tid=1) 是在做sdcard 格式化,而,"binder:1412_13"(tid=175) 里面有startActivityAsUser 函数,可以推测出来,是在做启动activity.

  也就是说格式化sdcard 时,又activity 会造成死锁.

  为了验证这种可能性,进行了验证.

  在格式化sdcard 时,开启camera,使用命令:

adb shell am start -a "android.media.action.STILL_IMAGE_CAMERA"

测试结果,果然复现.

六. 方案:

     临时方案:在格式化手机,禁止activity 启动.这个方案有可能涵盖不全.

    1.  androidt/frameworks/base/services/core/java/com/android/server/wm/ActivityStarter.java

import com.android.server.pm.StorageEventHelper;

    int execute() {
....
            int res;
            synchronized (mService.mGlobalLock) {
                final boolean globalConfigWillChange = mRequest.globalConfig != null
                        && mService.getGlobalConfiguration().diff(mRequest.globalConfig) != 0;
                final Task rootTask = mRootWindowContainer.getTopDisplayFocusedRootTask();
                if (rootTask != null) {
                    rootTask.mConfigWillChange = globalConfigWillChange;
                }
                ProtoLog.v(WM_DEBUG_CONFIGURATION, "Starting activity when config "
                        + "will change = %b", globalConfigWillChange);

                final long origId = Binder.clearCallingIdentity();

                res = resolveToHeavyWeightSwitcherIfNeeded();
                if (res != START_SUCCESS) {
                    return res;
                }
          +++      //XIBIN BEGIN
          +++      if(StorageEventHelper.volumeForgotten){
          +++         Slog.e(TAG,"the volume is forgetting,interrupt activity start.");
           +++          return START_SUCCESS;
           +++      }
           +++     //XIBIN END
                res = executeRequest(mRequest);

2. androidt/frameworks/base/services/core/java/com/android/server/pm/StorageEventHelper.java

    //XIBIN BEGIN
    public static boolean volumeForgotten = false;
    //XIBIN END


    @Override
    public void onVolumeForgotten(String fsUuid) {
        if (TextUtils.isEmpty(fsUuid)) {
            Slog.e(TAG, "Forgetting internal storage is probably a mistake; ignoring");
            return;
        }

        // Remove any apps installed on the forgotten volume
     +++  //XIBIN BEGIN
     +++  try{
     +++     volumeForgotten = true;
        synchronized (mPm.mLock) {
            final List<? extends PackageStateInternal> packages =
                    mPm.mSettings.getVolumePackagesLPr(fsUuid);
            for (PackageStateInternal ps : packages) {
                Slog.d(TAG, "Destroying " + ps.getPackageName()
                        + " because volume was forgotten");
                mPm.deletePackageVersioned(new VersionedPackage(ps.getPackageName(),
                                PackageManager.VERSION_CODE_HIGHEST),
                        new PackageManager.LegacyPackageDeleteObserver(null).getBinder(),
                        UserHandle.USER_SYSTEM, PackageManager.DELETE_ALL_USERS);
                // Try very hard to release any references to this package
                // so we don't risk the system server being killed due to
                // open FDs
                AttributeCache.instance().removePackage(ps.getPackageName());
            }

            mPm.mSettings.onVolumeForgotten(fsUuid);
            mPm.writeSettingsLPrTEMP();
        }
      +++ }finally{
      +++  volumeForgotten = false;
     +++ }
   +++   //XIBIN END

知秋君
上一篇 2024-07-04 16:48
下一篇 2024-07-04 16:12

相关推荐